Article

Combining Regional Energy and Intuitionistic Fuzzy Sets for Infrared and Visible Image Fusion

1 College of Electronic Information Engineering, Changchun University, Changchun 130012, China
2 School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
* Author to whom correspondence should be addressed.
Sensors 2021, 21(23), 7813; https://doi.org/10.3390/s21237813
Submission received: 1 November 2021 / Revised: 15 November 2021 / Accepted: 19 November 2021 / Published: 24 November 2021
(This article belongs to the Special Issue Instrument and Measurement Based on Sensing Technology in China)

Abstract

To obtain more salient target information and richer texture features, a new fusion method for infrared (IR) and visible (VIS) images combining regional energy (RE) and intuitionistic fuzzy sets (IFS) is proposed. The method proceeds in the following steps. First, the IR and VIS images are decomposed into low- and high-frequency sub-bands by the non-subsampled shearlet transform (NSST). Second, an RE-based fusion rule is used to obtain a low-frequency pre-fusion image, which allows important target information to be preserved in the resulting image. Based on the pre-fusion image, an IFS-based fusion rule is then applied to obtain the final low-frequency image, which enables more important texture information to be transferred to the resulting image. Third, the ‘max-absolute’ fusion rule is adopted to fuse the high-frequency sub-bands. Finally, the fused image is reconstructed by the inverse NSST. The TNO and RoadScene datasets are used to evaluate the proposed method. The simulation results demonstrate that the fused images of the proposed method have more salient targets, higher contrast, and richer detailed information and local features. Qualitative and quantitative analysis shows that the presented method is superior to nine other advanced fusion methods.

1. Introduction

Infrared (IR) and visible (VIS) image fusion focuses on synthesizing multiple images into one comprehensive image, and it is applied in face recognition [1], target detection [2], image enhancement [3], medicine [4], remote sensing [5], and other fields. The source images used in image fusion come from different sensors. An IR sensor captures the heat radiated by objects. IR images have low spatial resolution, limited background information, poor imaging quality, and high-contrast pixel intensities. In contrast, VIS images provide abundant background, rich detailed texture information, and high spatial resolution. Hence, an effective fusion of the two types of images provides more useful information and better visual effects for human observers, which is beneficial for subsequent research work [6,7].
The selection of the fusion rule is crucial because it determines the fusion effect. The essence of image fusion is how to reasonably select the valuable pixels of multiple source images and integrate them into one image. Image fusion can thus be regarded as a transfer of image information. The process is in fact a many-to-one mapping, which contains strong uncertainty. To address this problem, energy-based fusion strategies are often used to enhance image quality and reduce the uncertainty. Zhang [8] presented an RE-based fusion rule for IR and VIS image fusion that preserves more prominent infrared target information. Srivastava [9] proposed a local energy-based method to fuse multi-modal medical images that achieves better fusion performance. Liu [10] presented an average-RE fusion rule to fuse multi-focus and medical images, and the results show that the fused image contains more information and edge details. By taking the correlation of neighboring pixels into account, energy-based fusion strategies can overcome the uncertainty of improper pixel selection and improve the quality of the fused image to some extent.
In image fusion, uncertainty and ambiguity are extremely likely to occur (due to sampling, noise, blurred edges, and so on). It is therefore essential to introduce adaptive mechanisms to handle data uncertainty. Fuzzy logic and related techniques have produced many good results in this area. Versaci proposed a fuzzy geometrical approach to control image uncertainty [11]. As an extension of fuzzy set (FS) theory, intuitionistic fuzzy sets (IFS) are described by membership, non-membership, and hesitation degrees, which makes them more flexible and practical than FS in dealing with fuzziness and uncertainty [12]. In recent years, many related methods have been developed in the field of image fusion. T. Tirupal [13] presented a Sugeno intuitionistic fuzzy set (SIFS)-based method to fuse multi-modal medical images; the image obtained by this algorithm clearly distinguishes the edges of soft tissue and blood vessels, which is helpful for diagnosis. C. H. Seng [14] proposed a method based on probabilistic fuzzy logic to fuse through-the-wall radar images, and the results show that the fused image has higher contrast, which helps improve the target detection rate. Zhang [12] designed a method based on fractional-order derivatives and IFS for multi-focus image fusion, and the results show that the method avoids artifacts and preserves detailed information. These works demonstrate that IFS can address the problems that arise in the image fusion process and is well suited to image fusion.
In this paper, we combine RE and IFS to design a new image fusion strategy that enhances the quality of the fused image. To better extract detailed features, we use the non-subsampled shearlet transform (NSST) to decompose the IR and VIS images into low- and high-frequency sub-bands. For the high-frequency sub-bands, the ‘max-absolute’ rule is adopted to obtain the fused detailed information. For the low-frequency sub-bands, the new fusion strategy is applied in two steps. First, the RE-based fusion rule is performed on the low-frequency layers to obtain a pre-fusion image, which allows more target information to be preserved in the resulting image. Then, IFS is introduced to obtain the final fused low-frequency image, which enables more texture information to be transferred to the resulting image. Finally, the inverse NSST is used to reconstruct the fused result. Simulation experiments on public datasets demonstrate that this method outperforms other advanced fusion methods. The fused images have more stable quality, more salient targets, higher contrast, and richer detailed information and local features.
The rest of the paper is arranged as follows. The basic principles of NSST and fuzzy set theory are reviewed in Section 2. The fusion rules proposed in this study are introduced in Section 3. Experiments and result analysis are presented in Section 4. The paper is summarized in Section 5.

2. Related Works

2.1. Basic Principle of NSST

Fusion methods based on multi-scale geometric analysis (MGA) are widely used in IR and VIS image fusion [15,16,17,18,19,20,21,22,23]. MGA tools can represent images at different scales and in different directions, and these characteristics help extract more detailed information from the images. Among these MGA tools, NSST is regarded as one of the most popular [24]. Many researchers have shown that fused images produced by NSST-based methods are better suited to the human visual system. NSST was proposed by K. Guo, G. Easley et al. [25,26], and the model of NSST can be described as follows.
Assuming n = 2, the affine systems with composite dilations are the collections of the form [25]:
$$\Psi_{AB}(\psi)=\left\{\psi_{j,l,k}(x)=|\det A|^{j/2}\,\psi\!\left(B^{l}A^{j}x-k\right):\ j,l\in\mathbb{Z},\ k\in\mathbb{Z}^{2}\right\} \tag{1}$$
where $\psi \in L^{2}(\mathbb{R}^{2})$, and $A$ and $B$ are $2\times 2$ invertible matrices. By choosing $\psi$, $A$, and $B$ appropriately, $\Psi_{AB}$ can be made an orthonormal basis or, more generally, a Parseval frame (PF) for $L^{2}(\mathbb{R}^{2})$. Typically, the members of $B$ are shear matrices (all eigenvalues are one), while the members of $A$ are matrices expanding or contracting on a proper subspace of $\mathbb{R}^{2}$. These wavelets are of interest in applications because of their tendency to produce “long, narrow” window functions well suited to edge detection.
With $A_{a}=\begin{bmatrix} a & 0\\ 0 & \sqrt{a}\end{bmatrix}$ and $B_{s}=\begin{bmatrix} 1 & s\\ 0 & 1\end{bmatrix}$, the shearlet system is given by Equation (2), where $\psi_{ast}(x)$ is a shearlet [26]. Shearlets can be regarded as a special example of composite wavelets in $L^{2}(\mathbb{R}^{2})$ whose elements range not only over various scales and locations, like wavelets, but also over various orientations [27].
$$\psi_{ast}(x)=\left\{a^{-3/4}\,\psi\!\left(A_{a}^{-1}B_{s}^{-1}(x-t)\right):\ a\in\mathbb{R}^{+},\ s\in\mathbb{R},\ t\in\mathbb{R}^{2}\right\} \tag{2}$$
Figure 1 shows the NSST decomposition structure with two levels. The source image $f$ is decomposed into a low-pass image $f_{a}^{1}$ and a band-pass image $f_{d}^{1}$ by a non-subsampled pyramid (NSP). The NSP decomposition is then iterated on the low-frequency component obtained from the previous level. Shearlet filter banks are used to decompose $f_{d}^{1}$ and $f_{d}^{2}$ to obtain the high-frequency sub-band coefficients.
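To make the two-level structure in Figure 1 concrete, the following Python sketch imitates only the NSP stage with an undecimated Gaussian pyramid; the shearlet directional filter banks that further split each band-pass image into directional sub-bands are omitted, and the Gaussian low-pass filter and the function name are illustrative assumptions rather than the ‘maxflat’ filters used in the later experiments.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def nsp_decompose(img, levels=2):
    """Illustrative stand-in for the non-subsampled pyramid (NSP) stage:
    each level splits the current low-pass image into a coarser low-pass
    part f_a and a band-pass part f_d, with no downsampling."""
    low = np.asarray(img, dtype=float)
    band_pass = []
    for j in range(levels):
        smoothed = gaussian_filter(low, sigma=2.0 ** j)  # wider kernel at each level
        band_pass.append(low - smoothed)                 # band-pass image f_d of level j+1
        low = smoothed                                   # low-pass image f_a of level j+1
    return low, band_pass
```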

2.2. Fuzzy Set Theory

Zadeh presented fuzzy set (FS) theory in 1965 [27]. In FS theory, uncertain information is quantified by a membership degree expressed on the interval [0, 1]. A value between 0 and 1 represents the membership degree: 0 means non-membership and 1 means full membership. For an ordinary fuzzy set, the membership and non-membership degrees of an element sum to 1.
FS theory is good at representing qualitative knowledge with unclear boundaries, and it plays a vital role in eliminating the vagueness that exists in images [28]. Many studies show that image fusion methods based on FS theory are superior to other conventional algorithm models. Composite methods that combine FS theory with other representation methods can strictly select reliable pixel information from the source images [29].
According to general set theory, an element either belongs or does not belong to a set. Let $U$ denote the universe, $u \in U$, and $A \subseteq U$; the characteristic function $\chi_{A}$ of $A$ is defined as follows [30]:
$$\begin{aligned} \chi_{A}:\ & U \to \{0,1\} \\ & u \mapsto \chi_{A}(u) \in \{0,1\} \end{aligned} \tag{3}$$
$$\chi_{A}(u)=\begin{cases} 1, & u \in A \\ 0, & u \notin A \end{cases} \tag{4}$$
In FS theory, the definition of the membership function evolves from the characteristic function of general set theory. Let $\underline{A}$ denote a fuzzy subset of $U$; the membership function $\mu_{\underline{A}}$ is defined as follows [30]:
$$\begin{aligned} \mu_{\underline{A}}:\ & U \to [0,1] \\ & u \mapsto \mu_{\underline{A}}(u) \in [0,1] \end{aligned} \tag{5}$$
As Equation (5) shows, FS theory is established on the basis of the membership function; the membership function is therefore central to fuzzy mathematics.
An image with resolution $M \times N$ can be regarded as a fuzzy set of pixels, as follows [30]:
$$X=\bigcup_{i=1}^{M}\bigcup_{j=1}^{N}\frac{\mu_{ij}}{x_{ij}} \tag{6}$$
where $x_{ij}$ is the grayscale value of pixel $(i,j)$ and $\mu_{ij} \in [0,1]$ represents the membership degree of pixel $(i,j)$. The set $\{\mu_{ij}\}$ of all membership degrees forms the fuzzy characteristic plane. $\mu_{ij}$ is computed by a membership degree function, and different membership functions yield different $\mu_{ij}$; it is therefore convenient to adjust $\mu_{ij}$ to obtain different enhancement effects.
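As a small illustration of Equation (6), the sketch below computes one possible fuzzy characteristic plane for a grayscale image. The linear normalization used as the membership function, and the function name itself, are assumptions chosen for illustration; any other membership function could be substituted.

```python
import numpy as np

def fuzzy_characteristic_plane(img):
    """Map an M x N grayscale image to a fuzzy characteristic plane {mu_ij}.
    A simple linear normalization to [0, 1] serves as the membership
    function here; other membership functions yield different planes."""
    x = np.asarray(img, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min + 1e-12)  # mu_ij in [0, 1]
```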

3. Proposed Method

In this section, we propose a new fusion strategy for IR and VIS images: a method combining RE and IFS in the NSST domain (RE-IFS-NSST). Figure 2 illustrates the overall framework of the algorithm. The fusion process consists of four parts: NSST decomposition, low-frequency sub-band fusion, high-frequency sub-band fusion, and NSST reconstruction.

3.1. NSST Decomposition

In this paper, NSST is used to decompose the source images. The IR and VIS images are decomposed by NSST into high- and low-frequency sub-bands according to Equations (7) and (8). $IR_{L}$ and $VIS_{L}$ are the low-frequency sub-bands of the IR and VIS images, respectively; $IR_{H}^{j,k}$ and $VIS_{H}^{j,k}$ are the high-frequency sub-bands of the IR and VIS images at level $j$ in direction $k$, respectively.
$$\{IR_{L},\ IR_{H}^{j,k}\}=\mathrm{NSST\_DE}(IR) \tag{7}$$
$$\{VIS_{L},\ VIS_{H}^{j,k}\}=\mathrm{NSST\_DE}(VIS) \tag{8}$$
where $\mathrm{NSST\_DE}(\cdot)$ denotes the NSST decomposition of the input image.

3.2. The Rule for Low-Frequency Components

The low-frequency components contain most of the energy information, such as contours and background [31]. In this paper, RE and IFS are used to construct the fusion rule for the low-frequency components. The method consists of two steps: (1) pre-fusion based on RE; (2) final fusion based on IFS.
In the low-frequency component of an IR image, the salient targets are usually located in regions with large energy. A fusion rule based on RE transfers the energy information of the IR image to the fused image, achieving better extraction of the target information. Therefore, we first adopt the RE-based fusion rule to obtain the pre-fusion image. Based on the pre-fusion image, we then adopt the IFS-based method to obtain the final result. IFS describes membership, non-membership, and hesitation degrees simultaneously. According to the membership degree, the pixels of the source images can be easily, precisely, and effectively classified into target and background information. By means of the IFS-based method, the texture information of the VIS image can be transferred to the resulting image.
(1)
The pre-fusion based on RE
$IR_{L}$ and $VIS_{L}$ denote the low-frequency components of the IR and VIS images, respectively. They are first fused by the RE-based rule to obtain the pre-fused low-frequency image. The RE is calculated as follows [32]:
$$E_{S}(m,n)=\sum_{(i,j)\in \Omega(m,n)} A_{S}^{2}(i,j)\,W(i,j) \tag{9}$$
where $E_{S}(m,n)$ is the energy of the region centered at point $(m,n)$ and $S$ denotes IR or VIS; $\Omega(m,n)$ is the neighborhood window centered at $(m,n)$; $A_{S}(i,j)$ is the low-frequency coefficient at point $(i,j)$; and $W(i,j)$ is the value of the mask window at point $(i,j)$. The window function $W$ of size $3\times 3$ is given by Equation (10) [32]:
$$W=\frac{1}{16}\begin{bmatrix} 1 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 1 \end{bmatrix} \tag{10}$$
Based on the RE, the low-frequency images of IR and VIS are fused by the weighted-average RE rule. The weights are given in Equations (11)–(13).
$$w_{1}=\frac{E_{IR}}{E_{IR}+E_{VIS}} \tag{11}$$
$$w_{2}=1-w_{1} \tag{12}$$
$$f=w_{1}\times IR_{L}+w_{2}\times VIS_{L} \tag{13}$$
where $w_{1}$ and $w_{2}$ are the fusion weights and $f$ is the pre-fusion image.
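A minimal Python sketch of the pre-fusion step described by Equations (9)–(13) is given below. The per-pixel application of the weights and the ‘nearest’ boundary handling of the convolution are assumptions made for illustration; they are not fixed by the text.

```python
import numpy as np
from scipy.ndimage import convolve

# 3x3 mask window W of Equation (10)
W = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0],
              [1.0, 2.0, 1.0]]) / 16.0

def regional_energy(low):
    """Regional energy E_S(m, n) of Equation (9): weighted sum of squared
    low-frequency coefficients over the 3 x 3 neighbourhood."""
    return convolve(np.asarray(low, dtype=float) ** 2, W, mode='nearest')

def pre_fusion(ir_low, vis_low):
    """Weighted-average RE rule of Equations (11)-(13)."""
    e_ir = regional_energy(ir_low)
    e_vis = regional_energy(vis_low)
    w1 = e_ir / (e_ir + e_vis + 1e-12)          # Equation (11)
    return w1 * ir_low + (1.0 - w1) * vis_low   # Equations (12) and (13)
```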
(2)
The final fusion based on IFS
IFS is introduced to calculate the membership degrees of the IR and VIS low-frequency images, and the pre-fusion image is used as a reference to assist the final low-frequency fusion. Figure 3 shows the low-frequency sub-band fusion framework.
A Gaussian membership function is used to represent the membership degree of the coefficients, and the final low-frequency image is fused according to the membership degree after defuzzification. The membership $u_{IR}$ and non-membership $\nu_{IR}$ of $IR_{L}$ are given as follows [33]:
$$u_{IR}(x,y)=\exp\!\left[-\frac{\left(IR_{L}(x,y)-\varepsilon_{IR}\right)^{2}}{2\left(k_{1}\sigma_{IR}\right)^{2}}\right] \tag{14}$$
$$\nu_{IR}(x,y)=1-\exp\!\left[-\frac{\left(IR_{L}(x,y)-\varepsilon_{IR}\right)^{2}}{2\left(k_{2}\sigma_{IR}\right)^{2}}\right] \tag{15}$$
where $\varepsilon_{IR}$ is the mean of $IR_{L}$, $\sigma_{IR}$ is its standard deviation, and $k_{1}$ and $k_{2}$ are Gaussian function adjustment parameters. The hesitation degree $\pi_{IR}$ is obtained from $u_{IR}$ and $\nu_{IR}$ as follows:
$$\pi_{IR}(x,y)=1-u_{IR}(x,y)-\nu_{IR}(x,y) \tag{16}$$
The difference correction method is used to transform the IFS into an FS. $U_{IR}(x,y)$ is the FS membership degree, calculated as follows [34]:
$$U_{IR}(x,y)=u_{IR}(x,y)+\pi_{IR}(x,y)\times\left(0.5+\frac{u_{IR}(x,y)-\nu_{IR}(x,y)}{2}\right) \tag{17}$$
Similarly, $u_{VIS}$, $\nu_{VIS}$, $\pi_{VIS}$, and $U_{VIS}$ of the VIS low-frequency image can be calculated according to Equations (14)–(17). Based on empirical values, $k_{1}$ and $k_{2}$ are set to 0.8 and 1.2, respectively.
It can be seen from Equations (14)–(17) that a large gray value corresponds to a small membership value. Therefore, in IR images the targets have smaller membership values, and in VIS images the background and texture features have smaller membership values. The membership degree can thus be used to decide which valuable pixels are integrated into the final result. Using the pre-fusion image $f$ as the reference image, $U_{IR}$ and $U_{VIS}$ are compared to obtain the decision map that generates the final fused low-frequency image $F_{L}$. The fusion rule is defined as follows:
$$F_{L}=\begin{cases} VIS_{L}, & U_{IR}\ge U_{VIS}\\ f, & U_{IR}< U_{VIS} \end{cases} \tag{18}$$
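The following sketch follows Equations (14)–(18) directly. Computing the mean and standard deviation over the whole low-frequency sub-band is an assumption about a detail not fixed by the text.

```python
import numpy as np

def defuzzified_membership(low, k1=0.8, k2=1.2):
    """Gaussian membership u, non-membership v, hesitation pi and the
    defuzzified membership U of Equations (14)-(17)."""
    x = np.asarray(low, dtype=float)
    eps, sigma = x.mean(), x.std() + 1e-12
    u = np.exp(-(x - eps) ** 2 / (2.0 * (k1 * sigma) ** 2))
    v = 1.0 - np.exp(-(x - eps) ** 2 / (2.0 * (k2 * sigma) ** 2))
    pi = 1.0 - u - v
    return u + pi * (0.5 + (u - v) / 2.0)       # Equation (17)

def fuse_low_frequency(ir_low, vis_low, pre_fused):
    """Decision map of Equation (18): take VIS_L where U_IR >= U_VIS,
    otherwise keep the RE pre-fusion image f."""
    u_ir = defuzzified_membership(ir_low)
    u_vis = defuzzified_membership(vis_low)
    return np.where(u_ir >= u_vis, vis_low, pre_fused)
```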

3.3. The Rule for High-Frequency Components

Different from the low-frequency sub-bands, the high-frequency sub-bands mainly reflect the texture and contour information of the source images. Edges and contours are important information-carrying features that display the visual structure of the image, and they often correspond to pixels with sharp changes in brightness. Therefore, we adopt the max-absolute fusion rule to fuse the high-frequency sub-bands and retain rich texture information. The fusion rule is described as follows:
$$F_{H}^{j,k}=\begin{cases} IR_{H}^{j,k}, & \left|IR_{H}^{j,k}\right|>\left|VIS_{H}^{j,k}\right|\\ VIS_{H}^{j,k}, & \left|IR_{H}^{j,k}\right|\le\left|VIS_{H}^{j,k}\right| \end{cases} \tag{19}$$
where $F_{H}^{j,k}$ is the final fusion result of the high-frequency sub-bands.
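Equation (19) reduces to an element-wise selection per sub-band; a minimal sketch:

```python
import numpy as np

def fuse_high_frequency(ir_high, vis_high):
    """Max-absolute rule of Equation (19) for one high-frequency sub-band."""
    a = np.asarray(ir_high, dtype=float)
    b = np.asarray(vis_high, dtype=float)
    return np.where(np.abs(a) > np.abs(b), a, b)
```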

3.4. NSST Reconstruction

The fused image $F$ is reconstructed by the inverse NSST according to Equation (20):
$$F=\mathrm{NSST\_REC}\left(F_{L},\ F_{H}^{j,k}\right) \tag{20}$$
where $\mathrm{NSST\_REC}(\cdot)$ denotes the inverse NSST; $F$ is the output image.
The proposed RE-IFS-NSST method is summarized in Algorithm 1.
Algorithm 1. The proposed RE-IFS-NSST fusion algorithm.
Input: Infrared image (IR), Visible image (VIS)
Output: Fused image (F).
  • Decompose IR and VIS by NSST to obtain the low- and high-frequency coefficients $\{IR_{L}, IR_{H}^{j,k}\}$ and $\{VIS_{L}, VIS_{H}^{j,k}\}$ according to Equations (7) and (8);
  • Calculate the fusion weights of the low-frequency images $IR_{L}$ and $VIS_{L}$ according to Equations (9)–(13) to obtain the pre-fusion image $f$;
  • Obtain the final fusion result of the low-frequency part $F_{L}$ according to Equations (14)–(18);
  • Fuse the high-frequency sub-bands $\{IR_{H}^{j,k}, VIS_{H}^{j,k}\}$ with the max-absolute rule of Equation (19) to obtain $F_{H}^{j,k}$;
  • Reconstruct $\{F_{L}, F_{H}^{j,k}\}$ according to Equation (20) to obtain the final fused image $F$ (see the sketch after this list).
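The sketch below strings the helper functions from the earlier sketches into one pipeline following Algorithm 1. The callables nsst_decompose and nsst_reconstruct are assumed placeholders for an NSST implementation (not reproduced here) that return and accept a low-frequency image plus a list of high-frequency sub-bands.

```python
def re_ifs_nsst_fusion(ir, vis, nsst_decompose, nsst_reconstruct):
    """End-to-end sketch of Algorithm 1 using the helper sketches above."""
    ir_low, ir_high = nsst_decompose(ir)                   # Equation (7)
    vis_low, vis_high = nsst_decompose(vis)                # Equation (8)
    pre = pre_fusion(ir_low, vis_low)                      # Equations (9)-(13)
    f_low = fuse_low_frequency(ir_low, vis_low, pre)       # Equations (14)-(18)
    f_high = [fuse_high_frequency(a, b)
              for a, b in zip(ir_high, vis_high)]          # Equation (19)
    return nsst_reconstruct(f_low, f_high)                 # Equation (20)
```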

4. Experimental Results

4.1. Datasets

To test the effectiveness of the proposed method, experiments are conducted on two public datasets, the TNO Image Fusion Dataset and the RoadScene Dataset, which are widely used in the field of IR and VIS image fusion. We chose 6 sets of IR and VIS images from the TNO dataset and 5 sets from the RoadScene dataset. All image pairs can be downloaded from: https://github.com (accessed on 15 July 2021).

4.2. Experimental Setting

To test the practicability and effectiveness of the proposed method, we set up two groups of experiments. The first group compares the proposed method with the RE-NSST and IFS-NSST methods, and the second group compares it with nine other advanced fusion methods: FPDE [35] (fourth-order partial differential equations), VSM [36] (visual saliency map), Bala [37] (Bala fuzzy sets), Gauss [34] (Gauss fuzzy sets), DRTV [38] (fusion of different resolutions via a total variation model), LATLRR [39] (latent low-rank representation), SR [40] (sparse regularization), MDLatLRR [41] (a decomposition method based on latent low-rank representation), and RFN-Nest [42] (residual fusion network).
The experimental parameters are set as follows:
(1)
The computer is configured with a 2.6 GHz Intel Core CPU and 4 GB of memory, and all experimental code runs on the MATLAB 2017 platform.
(2)
In the proposed method, ‘maxflat’ is chosen as the pyramid filter. The number of decomposition levels is 3, and the numbers of directions are {16, 16, 16}.
(3)
In the RE-NSST and IFS-NSST methods, the NSST parameters are the same as those of the proposed method, and the RE and IFS calculations are also the same as in the proposed method.
(4)
In the Bala and Gauss methods, ‘9-7’ and ‘pkva’ are chosen as the pyramid filter and the directional filter, respectively, and the decomposition scale is 3.
(5)
In the MDLatLRR method, the decomposition level is set to 2.
(6)
The parameters of the remaining methods follow the best settings reported in the corresponding papers.

4.3. Quantitative Evaluation

In this section, six quantitative evaluation indexes are introduced to evaluate the performance of each algorithm objectively: E, AG, MI, CE, SPD, and PSNR. Among them, E and AG are no-reference indexes, while MI, CE, SPD, and PSNR are reference-based indexes. MI and CE use the IR and VIS images as references to measure the similarity and difference between the source images and the fused image. SPD and PSNR use the VIS image as the reference to reflect the interference information transferred from the VIS image into the fused image. To comprehensively evaluate the fusion performance from different aspects, we employ both reference-based and no-reference indexes in this study.
(1)
Entropy (E) [43]
E describes the average amount of information in an image and is calculated using Equation (21):
$$E=-\sum_{i=0}^{L-1} p_{i}\log_{2} p_{i} \tag{21}$$
where $L$ is the total number of gray levels and $p_{i}$ is the probability of gray value $i$.
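A possible Python implementation of Equation (21), assuming 8-bit images with 256 gray levels, is:

```python
import numpy as np

def entropy(img, levels=256):
    """Shannon entropy of Equation (21), computed from the gray-level histogram."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]                      # ignore empty bins
    return float(-np.sum(p * np.log2(p)))
```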
(2)
Average Gradient (AG) [43]
AG reflects the micro-detail contrast and the variation of texture features in the fused image. It is expressed as follows:
$$AG=\frac{1}{M\times N}\sum_{m=1}^{M}\sum_{n=1}^{N}\sqrt{\frac{\Delta F_{x}^{2}(m,n)+\Delta F_{y}^{2}(m,n)}{2}} \tag{22}$$
where $M\times N$ is the size of the fused image $F$, and $\Delta F_{x}$ and $\Delta F_{y}$ are the differences of $F$ in the horizontal and vertical directions at pixel $(m,n)$.
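A sketch of Equation (22) using finite differences (the exact difference scheme is an assumption, since the text does not fix it):

```python
import numpy as np

def average_gradient(img):
    """Average gradient of Equation (22) from horizontal and vertical differences."""
    f = np.asarray(img, dtype=float)
    dx = np.diff(f, axis=1)[:-1, :]   # horizontal differences, cropped to a common shape
    dy = np.diff(f, axis=0)[:, :-1]   # vertical differences
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))
```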
(3)
Mutual Information (MI) [44]
MI reflects the amount of information transferred from the source images to the fused image and is calculated as follows:
$$MI_{X,F}=\sum_{i=1}^{L}\sum_{j=1}^{L} P_{X,F}(i,j)\log_{2}\!\left(\frac{P_{X,F}(i,j)}{P_{X}(i)\,P_{F}(j)}\right) \tag{23}$$
$$MI=MI_{IR,F}+MI_{VIS,F} \tag{24}$$
where $P_{X}$ and $P_{F}$ are the gray-level distributions of the source image and the fused image, respectively, and $P_{X,F}$ is their joint probability distribution. The sum of $MI_{IR,F}$ and $MI_{VIS,F}$ gives the mutual information value.
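A histogram-based sketch of Equations (23) and (24), assuming 8-bit gray levels:

```python
import numpy as np

def mutual_information(src, fused, levels=256):
    """MI_{X,F} of Equation (23) from the joint gray-level histogram."""
    joint, _, _ = np.histogram2d(np.ravel(src), np.ravel(fused),
                                 bins=levels, range=[[0, levels], [0, levels]])
    p_xf = joint / joint.sum()
    p_x = p_xf.sum(axis=1, keepdims=True)     # marginal of the source image
    p_f = p_xf.sum(axis=0, keepdims=True)     # marginal of the fused image
    mask = p_xf > 0
    return float(np.sum(p_xf[mask] * np.log2(p_xf[mask] / (p_x * p_f)[mask])))

# Equation (24): MI = mutual_information(ir, fused) + mutual_information(vis, fused)
```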
(4)
Cross Entropy (CE) [44]
CE reflects the difference in gray-level distribution between the fused image and the source images; it is defined as follows:
$$CE(IR,F)=\sum_{i=1}^{L} p_{i}\log_{2}\!\left(\frac{p_{i}}{q_{i}}\right) \tag{25}$$
$$CE(VIS,F)=\sum_{i=1}^{L} v_{i}\log_{2}\!\left(\frac{v_{i}}{q_{i}}\right) \tag{26}$$
$$CE(IR,VIS,F)=\sqrt{\frac{CE^{2}(IR,F)+CE^{2}(VIS,F)}{2}} \tag{27}$$
where $L$ is the number of gray levels, and $p_{i}$, $v_{i}$, and $q_{i}$ are the probabilities of gray value $i$ in the IR, VIS, and fused images, respectively.
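A sketch of Equations (25)–(27), again assuming 8-bit gray levels and ignoring empty histogram bins:

```python
import numpy as np

def cross_entropy(src, fused, levels=256):
    """Cross entropy between one source image and the fused image,
    Equations (25) and (26)."""
    p, _ = np.histogram(src, bins=levels, range=(0, levels))
    q, _ = np.histogram(fused, bins=levels, range=(0, levels))
    p = p / p.sum()
    q = q / q.sum()
    mask = (p > 0) & (q > 0)          # skip gray levels absent from either image
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def overall_cross_entropy(ir, vis, fused):
    """Root-mean-square combination of Equation (27)."""
    ce_ir = cross_entropy(ir, fused)
    ce_vis = cross_entropy(vis, fused)
    return float(np.sqrt((ce_ir ** 2 + ce_vis ** 2) / 2.0))
```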
(5)
Spectral Distortion (SPD) [45]
SPD reflects the degree of spectral distortion between the fused image and the VIS image; it is given by Equation (28):
$$SPD=\frac{1}{M\times N}\sum_{i=1}^{M}\sum_{j=1}^{N}\left|F(i,j)-VIS(i,j)\right| \tag{28}$$
where $F(i,j)$ and $VIS(i,j)$ are the gray values of the fused image and the VIS image at $(i,j)$, respectively.
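Equation (28) is a mean absolute difference; a one-function sketch:

```python
import numpy as np

def spectral_distortion(fused, vis):
    """SPD of Equation (28): mean absolute deviation of the fused image from the VIS image."""
    diff = np.asarray(fused, dtype=float) - np.asarray(vis, dtype=float)
    return float(np.mean(np.abs(diff)))
```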
(6)
Peak signal to noise ratio (PSNR) [44]
PSNR measures the ratio between the effective information and the noise of an image and indicates whether the image is distorted. It is given as follows:
$$MSE=\frac{1}{M\times N}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(F(i,j)-VIS(i,j)\right)^{2} \tag{29}$$
$$PSNR=10\,\lg\frac{Z^{2}}{MSE} \tag{30}$$
where $F(i,j)$ and $VIS(i,j)$ are the gray values of the fused image and the source image at $(i,j)$, respectively. MSE is the mean square error, which reflects the degree of difference between the two images. $Z$ is the difference between the maximum and minimum gray values of the source image.
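A sketch of Equations (29) and (30), with Z taken as the gray-value range of the reference image as described above:

```python
import numpy as np

def psnr(fused, vis):
    """PSNR of Equation (30)."""
    f = np.asarray(fused, dtype=float)
    v = np.asarray(vis, dtype=float)
    mse = np.mean((f - v) ** 2)        # Equation (29)
    z = v.max() - v.min()              # dynamic range Z of the source image
    return float(10.0 * np.log10(z ** 2 / (mse + 1e-12)))
```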

4.4. Fusion Results on the TNO Dataset

4.4.1. Comparison with RE-NSST and IFS-NSST Methods

In the first group of simulation experiments, we compare three methods: RE-NSST, IFS-NSST, and the proposed method. The qualitative comparison results are shown in Figure 4.
As shown in Figure 4, all three methods can fuse the source images, but the results differ. The RE-NSST method achieves relatively salient targets, but the fused images suffer from low contrast and blurring. The fused images of the IFS-NSST method have higher contrast and more detailed information, but false contours and block effects are inevitable, so their visual quality is poor. Compared with the RE-NSST and IFS-NSST methods, the proposed method extracts complete infrared targets and continuous, clear edge details. The fused images are more suitable for human observation.
The quantitative comparison results are shown in Table 1, where red and blue denote the best and second-best results, respectively. For E, AG, MI, and PSNR, larger values indicate better performance; for CE and SPD, smaller values indicate better performance. Except for AG, the proposed method is superior to RE-NSST and IFS-NSST on all metrics, which means it has the best overall performance.

4.4.2. Comparison with the State-of-the-Art Methods

In the second group of experiments, we compare the proposed method with the other advanced methods. Figure 5 shows the qualitative fusion results on the TNO dataset. As shown in Figure 5, the Bala method achieves complete infrared targets, but the background is blurred. The Gauss method preserves more texture features (such as the edges of windows and roads), but its fused images suffer from obvious block effects. FPDE loses detailed edge information (e.g., roads, trees, shrubs, windows, and street lights), which makes scene recognition difficult. The infrared targets are highlighted in the fused images of DRTV, but the background is blurred and edge information is severely lost. VSM, LATLRR, MDLatLRR, SR, and RFN achieve relatively rich details, but the brightness of the infrared targets is low. Compared with these nine methods, the proposed method obtains better image quality: the fused images have more salient infrared targets and more abundant background and detail information, making them more suitable for the human visual system.
Figure 6 displays the quantitative results of the ten methods. The proposed method achieves the best performance on four objective evaluation metrics (E, AG, SPD, PSNR) and the second-best performance on the other two (MI, CE); the gaps to the best MI and CE values are small, and the proposed method still outperforms the remaining compared methods on these metrics.

4.4.3. Analysis

We take the ‘2_Men in front of house’ image as an example to further illustrate the superiority of the algorithm. The region contained in the red rectangle of each fused image is enlarged by the same factor. The qualitative comparison results of all ten methods are shown in Figure 7.
The magnified detail regions illustrate that the proposed method has the following advantages:
(1)
The proposed method can transfer more detailed texture features of the shrubs and trees to the resulting image.
(2)
The proposed method can preserve salient infrared target information in the resulting image.
(3)
The proposed method can improve the image contrast and brightness.
The objective evaluation results of the ten methods on the ‘2_Men in front of house’ image are shown in Table 2, which shows that the proposed method obtains the best results on all six metrics. In general, the proposed method performs better than the other compared methods.

4.5. Fusion Results on the RoadScene Dataset

To further evaluate the applicability of the proposed method, we conduct experiments on the RoadScene dataset. The images in the RoadScene dataset contain rich road traffic scenes, such as vehicles, pedestrians, and roads. The VIS images in the RoadScene dataset are color images, so they are first converted to grayscale in the fusion experiments. We choose 5 pairs of typical images for the comparison experiments and enlarge the region contained in the red rectangle of each fused image. The comparison results are shown in Figure 8.
Figure 8 shows that the fused images obtained by the SR and RFN methods are darker than the others. The infrared targets of the FPDE and VSM methods are not prominent. Although the DRTV method obtains salient targets, it cannot recover the background texture features. The Bala method loses image detail information. The fused images of the LATLRR and MDLatLRR methods have lower contrast. The fused images of the proposed method achieve plentiful detailed texture features and apparent infrared targets. The qualitative results show that the proposed method obtains better fusion performance and is also applicable to the RoadScene dataset.
Figure 9 shows the quantitative results of all the fusion methods on the RoadScene dataset. The proposed method performs best on four objective evaluation metrics (E, AG, SPD, PSNR) and second-best on the other two (MI and CE). The proposed method achieves a lower MI value and a higher CE value; a possible reason is that it discards some useless interference information, such as the noise present in the source images of the RoadScene dataset, so the similarity between the fused image and the source images is relatively small, resulting in a lower MI and a higher CE.
The results on the RoadScene dataset are basically consistent with those on the TNO dataset. The qualitative and quantitative experimental results show that the proposed method generates fused images with prominent targets and abundant details, which are more appropriate for the human visual system.

4.6. The Computational Complexity Analysis

To analyze the computational complexity, we measure the running time of the different methods when fusing the above image pairs from the TNO and RoadScene datasets. All experiments were performed under the same conditions. The running times are listed in Table 3 and Table 4, where the best and second-best times are shown in red and blue, respectively, and the results of the proposed method are shown in bold.
According to Table 3 and Table 4, DRTV shows the best computational efficiency among the compared fusion methods, but its fusion performance is not the most satisfactory. The computational cost of VSM is low on the TNO dataset, but it does not achieve the same result on the RoadScene dataset. The running times of LATLRR and MDLatLRR are relatively long. The running time of the MDLatLRR method is closely related to the decomposition level: as the number of decomposition levels increases, the running time grows. Although the proposed method does not obtain the lowest running time, its fusion quality is superior to that of the other methods, and it is more stable when dealing with different datasets.

5. Conclusions

In this paper, we present a new fusion method employing RE and IFS in the NSST domain for IR and VIS images. Thanks to the RE-IFS-NSST fusion strategy, the fused image exhibits salient infrared target information and plentiful texture features simultaneously. We conduct experiments on two public datasets, and six evaluation indexes are used to test the performance of the presented method. The quantitative results show that the proposed method is superior to nine other methods. Compared with the best results of the nine methods, the E, AG, PSNR, and SPD of this method on the TNO dataset are improved by 7.2%, 12.9%, 5.5%, and 92.3%, respectively, and the same four indexes on the RoadScene dataset are improved by 7.4%, 1.5%, 13.5%, and 25.7%. The qualitative results demonstrate that the fused images have better quality and are more consistent with the human visual system. The proposed method has potential applications in target detection fields such as military surveillance, medical diagnosis, and target tracking.

Author Contributions

Conceptualization, T.X. and X.X.; methodology, X.X. and C.L. (Cong Luo); software, J.Z.; validation, C.L. (Cheng Liu) and M.Y.; investigation, C.L. (Cheng Liu); writing—original draft preparation, C.L. (Cong Luo); writing—review and editing, C.L. (Cong Luo), J.Z. and X.X.; supervision, T.X.; project administration, X.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Natural Science Foundation of China under [NO: 61805021], and in part by the funds of the Science Technology Department of Jilin Province [NO: 20200401146GX].

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hu, W.; Hu, H. Discriminant Deep Feature Learning based on joint supervision Loss and Multi-layer Feature Fusion for heterogeneous face recognition. Comput. Vis. Image Underst. 2019, 184, 9–21. [Google Scholar] [CrossRef]
  2. Sun, H.; Liu, Q.; Wang, J.; Ren, J.; Wu, Y.; Zhao, H.; Li, H. Fusion of Infrared and Visible Images for Remote Detection of Low-Altitude Slow-Speed Small Targets. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2971–2983. [Google Scholar] [CrossRef]
  3. Huang, H.; Dong, L.; Xue, Z.; Liu, X.; Hua, C. Fusion algorithm of visible and infrared image based on anisotropic diffusion and image enhancement. PLoS ONE 2021, 16, e0245563. [Google Scholar]
  4. Jose, J.; Gautam, N.; Tiwari, M.; Tiwari, T.; Suresh, A.; Sundararaj, V.; Mr, R. An image quality enhancement scheme employing adolescent identity search algorithm in the NSST domain for multimodal medical image fusion. Biomed. Signal Proces. Control 2021, 66, 102480. [Google Scholar] [CrossRef]
  5. Jin, X.; Huang, S.; Jiang, Q.; Lee, S.J.; Wu, L.; Yao, S. Semisupervised Remote Sensing Image Fusion Using Multiscale Conditional Generative Adversarial Network with Siamese Structure. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 7066–7084. [Google Scholar] [CrossRef]
  6. Qi, G.; Chang, L.; Luo, Y.; Chen, Y.; Zhu, Z.; Wang, S. A Precise Multi-Exposure Image Fusion Method Based on Low-level Features. Sensors 2020, 20, 1597. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  7. Liu, Y.; Dong, L.; Ji, Y.; Xu, W. Infrared and Visible Image Fusion through Details Preservation. Sensors 2019, 19, 4556. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  8. Zhang, S.; Liu, F. Infrared and visible image fusion based on non-subsampled shearlet transform, regional energy, and co-occurrence filtering. Electron. Lett. 2020, 56, 761–764. [Google Scholar] [CrossRef]
  9. Srivastava, R.; Prakash, O.; Khare, A. Local energy-based multimodal medical image fusion in curvelet domain. IET Comput. Vis. 2016, 10, 513–527. [Google Scholar] [CrossRef]
  10. Liu, X.; Zhou, Y.; Wang, J. Image fusion based on shearlet transform and regional features. AEU-Int. J. Electron. Commun. 2014, 68, 471–477. [Google Scholar] [CrossRef]
  11. Versaci, M.; Calcagno, S.; Morabito, F.C. Fuzzy Geometrical Approach Based on Unit Hyper-Cubes for Image Contrast Enhancement. In Proceedings of the 2015 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), Kuala Lumpur, Malaysia, 19–21 October 2015; pp. 488–493. [Google Scholar]
  12. Zhang, X.F.; Yan, H.; He, H. Multi-focus image fusion based on fractional-order derivative and intuitionistic fuzzy sets. Front. Inf. Technol. Electron. Eng. 2020, 21, 834–843. [Google Scholar] [CrossRef]
  13. Tirupal, T.; Mohan, B.C.; Kumar, S.S. Multimodal medical image fusion based on Sugeno’s intuitionistic fuzzy sets. ETRI J. 2017, 39, 173–180. [Google Scholar] [CrossRef]
  14. Seng, C.H.; Bouzerdoum, A.; Amin, M.G.; Phung, S.L. Probabilistic Fuzzy Image Fusion Approach for Radar Through Wall Sensing. IEEE Trans. Image Process. 2013, 22, 4938–4951. [Google Scholar] [CrossRef] [PubMed]
  15. Ma, J.; Ma, Y.; Li, C. Infrared and visible image fusion methods and applications: A survey. Inform. Fusion 2019, 45, 153–178. [Google Scholar] [CrossRef]
  16. Zhou, Z.H.; Tan, M. Infrared Image and Visible Image Fusion Based on Wavelet Transform. Adv. Mater. Res. 2013, 756–759, 2850–2856. [Google Scholar] [CrossRef]
  17. Kou, L.; Zhang, L.; Zhang, K.; Sun, J.; Han, Q.; Jin, Z. A multi-focus image fusion method via region mosaicking on Laplacian pyramids. PLoS ONE 2018, 13, e0191085. [Google Scholar] [CrossRef] [PubMed]
  18. Guo, L.; Dai, M.; Zhu, M. Multifocus color image fusion based on quaternion curvelet transform. Opt. Express 2012, 20, 18846–18860. [Google Scholar] [CrossRef] [PubMed]
  19. Mosavi, M.R.; Bisjerdi, M.H.; Rezai-Rad, G. Optimal Target-Oriented Fusion of Passive Millimeter Wave Images with Visible Images Based on Contourlet Transform. Wireless Pers. Commun. 2017, 95, 4643–4666. [Google Scholar] [CrossRef]
  20. Adu, J.; Gan, J.; Wang, Y.; Huang, J. Image fusion based on nonsubsampled contourlet transform for infrared and visible light image. Infrared Phys. Technol. 2013, 61, 94–100. [Google Scholar] [CrossRef]
  21. Fan, Z.; Bi, D.; Gao, S.; He, L.; Ding, W. Adaptive enhancement for infrared image using shearlet frame. J. Opt. 2016, 18, 085706. [Google Scholar] [CrossRef]
  22. Guorong, G.; Luping, X.; Dongzhu, F. Multi-focus image fusion based on non-subsampled shearlet transform. IET Image Process. 2013, 7, 633–639. [Google Scholar] [CrossRef]
  23. El-Hoseny, H.M.; El-Rahman, W.A.; El-Shafai, W.; El-Banby, G.M.; El-Rabaie, E.S.M.; Abd El-Samie, F.E.; Faragallah, O.S.; Mahmoud, K.R. Efficient multi-scale non-sub-sampled shearlet fusion system based on modified central force optimization and contrast enhancement. Infrared Phys. Technol. 2019, 102, 102975. [Google Scholar] [CrossRef]
  24. Kong, W.; Wang, B.; Lei, Y. Technique for infrared and visible image fusion based on non-subsampled shearlet transform and spiking cortical model. Infrared Phys. Technol. 2015, 71, 87–98. [Google Scholar] [CrossRef]
  25. Guo, K.; Labate, D. Optimally Sparse Multidimensional Representation Using Shearlets. SIAM J. Math. Anal. 2007, 39, 298–318. [Google Scholar] [CrossRef] [Green Version]
  26. Easley, G.; Labate, D.; Lim, W.Q. Sparse directional image representations using the discrete shearlet transform. Appl. Comput. Harmon. Anal. 2008, 25, 25–46. [Google Scholar] [CrossRef] [Green Version]
  27. Deschrijver, G.; Kerre, E.E. On the relationship between some extensions of fuzzy set theory. Fuzzy Sets Syst. 2003, 133, 227–235. [Google Scholar] [CrossRef]
  28. Bai, X. Infrared and Visual Image Fusion through Fuzzy Measure and Alternating Operators. Sensors 2015, 15, 17149–17167. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  29. Saeedi, J.; Faez, K. Infrared and visible image fusion using fuzzy logic and population-based optimization. Appl. Soft Comput. 2012, 12, 1041–1054. [Google Scholar] [CrossRef]
  30. Pal, S.K.; King, R. Image enhancement using smoothing with fuzzy sets. IEEE Trans. Syst. Man Cybern. 1981, 11, 494–500. [Google Scholar]
  31. Selvaraj, A.; Ganesan, P. Infrared and visible image fusion using multi-scale NSCT and rolling-guidance filter. IET Image Process. 2020, 14, 4210–4219. [Google Scholar] [CrossRef]
  32. Yu, S.; Chen, X. Infrared and Visible Image Fusion Based on a Latent Low-Rank Representation Nested with Multiscale Geometric Transform. IEEE Access 2020, 8, 110214–110226. [Google Scholar] [CrossRef]
  33. Cai, H.; Zhuo, L.; Zhu, P.; Huang, Z.; Wu, X. Fusion of infrared and visible images based on non-subsampled contourlet transform and intuitionistic fuzzy set. Acta Photon. Sin. 2018, 47, 610002. [Google Scholar] [CrossRef]
  34. Dai, Z.; Wang, Q. Research on fusion method of visible and infrared image based on PCNN and IFS. J. Optoelectron. Laser 2020, 31, 738–744. [Google Scholar]
  35. Bavirisetti, D.P.; Xiao, G.; Liu, G. In Multi-Sensor Image Fusion Based on Fourth Order Partial Differential Equations. In Proceedings of the 2017 20th International Conference on Information Fusion (Fusion), Xi’an, China, 10–13 July 2017; pp. 1–9. [Google Scholar]
  36. Ma, J.; Zhou, Z.; Wang, B.; Zong, H. Infrared and visible image fusion based on visual saliency map and weighted least square optimization. Infrared Phys. Technol. 2017, 82, 8–17. [Google Scholar] [CrossRef]
  37. Jingchao, Z.; Suzhen, L.; Dawei, L.; Lifang, W.; Xiaoli, Y. Comparative Study of Intuitionistic Fuzzy Sets in Multi-band Image Fusion. Infrared Technol. 2018, 40, 881. [Google Scholar]
  38. Du, Q.; Xu, H.; Ma, Y.; Huang, J.; Fan, F. Fusing infrared and visible images of different resolutions via total variation model. Sensors 2018, 18, 3827. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  39. Li, H.; Wu, X.J. Infrared and visible image fusion using latent low-rank representation. arXiv 2018, arXiv:1804.08992. [Google Scholar]
  40. Anantrasirichai, N.; Zheng, R.; Selesnick, I.; Achim, A. Image fusion via sparse regularization with non-convex penalties. Pattern Recogn. Lett. 2020, 131, 355–360. [Google Scholar] [CrossRef] [Green Version]
  41. Li, H.; Wu, X.J.; Kittler, J. MDLatLRR: A novel decomposition method for infrared and visible image fusion. IEEE Trans. Image Process. 2020, 29, 4733–4746. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  42. Li, H.; Wu, X.J.; Kittler, J. RFN-Nest: An end-to-end residual fusion network for infrared and visible images. Inf. Fusion 2021, 73, 72–86. [Google Scholar] [CrossRef]
  43. Xing, X.X.; Liu, C.; Luo, C.; Xu, T. Infrared and visible image fusion based on nonlinear enhancement and NSST decomposition. EURASIP J. Wirel. Commun. Netw. 2020, 2020, 162. [Google Scholar] [CrossRef]
  44. Yang, Y.C.; Li, J.; Wang, Y.P. Review of image fusion quality evaluation methods. J. Front. Comput. Sci. Technol. 2018, 12, 1021–1035. [Google Scholar]
  45. Guo, Q.; Liu, S. Performance analysis of multi-spectral and panchromatic image fusion techniques based on two wavelet discrete approaches. Optik 2011, 122, 811–819. [Google Scholar] [CrossRef]
Figure 1. NSST decomposition structure with two levels.
Figure 2. The overall framework of the proposed algorithm.
Figure 3. The fusion framework of the detailed layer.
Figure 4. The fusion results in TNO Dataset. The first and second rows are the IR and VIS images. From the third to the fifth row are the fusion results of RE-NSST, IFS-NSST and Proposed methods on 4 sets of the source images.
Figure 5. Results on TNO dataset. From the first row to the last row are IR images, VIS images, the fusion results of FPDE, VSM, Bala, Gauss, DRTV, LATLRR, SR, MDLatLRR, RFN, and the proposed methods.
Figure 6. Comparison results of six evaluation parameters. The nine methods, i.e., FPDE, VSM, Bala, Gauss, DRTV, LATLRR, SR, MDLatLRR, RFN are compared with the proposed method.
Figure 7. Fusion results on the ‘2_Men in front of house’ image. Panels (c–l) show the fusion results of the ten methods.
Figure 8. Results on the RoadScene dataset. From the first row to the last are IR images, VIS images, the fusion results of FPDE, VSM, Bala, Gauss, DRTV, LATLRR, SR, MDLatLRR, RFN, and the proposed method.
Figure 9. Quantitative results of six evaluation parameters. The nine methods, i.e., FPDE, VSM, Bala, Gauss, DRTV, LATLRR, SR, MDLatLRR, RFN are compared with the proposed method.
Table 1. Quantitative comparison results of RE-NSST, IFS-NSST, and proposed methods.

| Pictures | Algorithm | E | AG | MI | CE | SPD | PSNR |
| --- | --- | --- | --- | --- | --- | --- | --- |
| set2 | RE-NSST | 7.0231 | 7.8391 | 1.5491 | 0.4003 | 20.2137 | 18.6581 |
| set2 | IFS-NSST | 7.4985 | 8.2866 | 1.6744 | 0.4143 | 17.9667 | 17.4212 |
| set2 | Proposed | 7.5003 | 8.1158 | 2.1463 | 0.3881 | 9.1239 | 21.3318 |
| set4 | RE-NSST | 6.6502 | 6.6127 | 1.3885 | 1.1583 | 27.0561 | 16.3921 |
| set4 | IFS-NSST | 7.1960 | 6.9817 | 1.5673 | 0.4401 | 23.0697 | 16.2754 |
| set4 | Proposed | 7.2065 | 6.8724 | 1.7395 | 0.2678 | 16.6751 | 17.9727 |
| set5 | RE-NSST | 7.1367 | 5.0369 | 2.1408 | 0.8499 | 30.3678 | 14.9696 |
| set5 | IFS-NSST | 7.4193 | 5.2169 | 2.1355 | 0.7068 | 22.5062 | 15.6012 |
| set5 | Proposed | 7.5317 | 5.2223 | 2.2287 | 0.4438 | 17.1070 | 16.7450 |
| set6 | RE-NSST | 6.7152 | 5.1251 | 2.3817 | 2.5125 | 30.7200 | 17.3706 |
| set6 | IFS-NSST | 7.1657 | 5.6524 | 2.7181 | 1.4545 | 23.0795 | 14.9469 |
| set6 | Proposed | 7.1764 | 5.2818 | 3.0924 | 1.2185 | 10.6998 | 22.1216 |
Table 2. Quantitative comparison results of the ten methods on the ‘2_Men in front of house’ image.

| Algorithm | E | AG | MI | CE | SPD | PSNR |
| --- | --- | --- | --- | --- | --- | --- |
| FPDE | 6.6385 | 5.1961 | 1.5334 | 1.1573 | 25.8568 | 17.8237 |
| VSM | 6.5374 | 5.0431 | 1.1345 | 1.5274 | 30.3846 | 15.9714 |
| Bala | 6.7515 | 2.5025 | 1.3897 | 0.6556 | 28.1342 | 16.8945 |
| Gauss | 6.7573 | 3.3446 | 1.4076 | 1.5874 | 27.1725 | 17.3515 |
| DRTV | 7.0767 | 4.9648 | 1.9273 | 0.8521 | 60.3667 | 10.0039 |
| LATLRR | 6.6468 | 3.3042 | 1.1525 | 1.2658 | 31.8783 | 15.9835 |
| SR | 6.6610 | 3.4351 | 1.7760 | 1.6768 | 27.7083 | 17.1956 |
| MDLatLRR | 6.6913 | 3.9260 | 1.8946 | 0.4422 | 134.4224 | 5.3021 |
| RFN | 6.6424 | 2.7265 | 1.1956 | 1.4660 | 30.4971 | 15.5273 |
| Proposed | 7.1322 | 5.6427 | 2.0628 | 0.2455 | 14.5461 | 19.2055 |
Table 3. Running time T (s) of the ten methods on the TNO dataset.

| Images | FPDE | VSM | Bala | Gauss | DRTV | LATLRR | SR | MDLatLRR | RFN | Proposed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| set1 | 11.0281 | 2.1048 | 32.5587 | 33.2193 | 0.8448 | 105.9846 | 6.0897 | 150.6458 | 10.6317 | 3.1674 |
| set2 | 18.6852 | 3.6156 | 49.1194 | 49.3599 | 1.3805 | 111.3046 | 10.2661 | 86.3398 | 11.4100 | 4.8280 |
| set3 | 10.2954 | 2.2599 | 31.4941 | 31.0160 | 0.8172 | 99.6353 | 5.9340 | 180.6707 | 11.9672 | 3.1971 |
| set4 | 1.7641 | 0.8080 | 22.3608 | 19.6283 | 0.2517 | 33.6849 | 1.7168 | 80.0596 | 12.7350 | 1.3497 |
| set5 | 10.8291 | 2.1046 | 32.4518 | 32.0341 | 0.8023 | 106.767 | 6.3531 | 161.4593 | 13.6248 | 3.1962 |
Table 4. Running time T (s) of the ten methods on the RoadScene dataset.

| Images | FPDE | VSM | Bala | Gauss | DRTV | LATLRR | SR | MDLatLRR | RFN | Proposed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| set1 | 5.3412 | 2.0977 | 36.0400 | 37.9651 | 0.4194 | 104.1814 | 3.1564 | 153.4153 | 9.7678 | 2.3402 |
| set2 | 2.6183 | 3.0216 | 17.0855 | 19.6853 | 0.2962 | 75.0036 | 1.7318 | 95.9721 | 110.8458 | 1.5268 |
| set3 | 7.6224 | 4.6006 | 24.3446 | 25.7651 | 0.5192 | 112.2796 | 3.1698 | 192.0942 | 11.7744 | 1.9909 |
| set4 | 2.7065 | 5.3109 | 14.0975 | 15.7018 | 0.2460 | 55.0571 | 1.5370 | 76.5312 | 412.1473 | 1.2707 |
| set5 | 6.1202 | 6.8466 | 23.5806 | 25.0931 | 0.4660 | 103.0710 | 3.0122 | 162.6771 | 13.0335 | 1.9509 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
