
Decomposed Multilateral Filtering for Accelerating Filtering with Multiple Guidance Images

1 Department of Computer Science, Faculty of Engineering, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya 466-8555, Japan
2 Department of Electrical Engineering, Faculty of Engineering, Tokyo University of Science, Tokyo 125-8585, Japan
* Author to whom correspondence should be addressed.
Sensors 2024, 24(2), 633; https://doi.org/10.3390/s24020633
Submission received: 13 November 2023 / Revised: 8 January 2024 / Accepted: 17 January 2024 / Published: 19 January 2024
(This article belongs to the Special Issue Feature Papers in "Sensing and Imaging" Section 2023)

Abstract

This paper proposes an efficient algorithm for edge-preserving filtering with multiple guidance images, so-called multilateral filtering. Multimodal signal processing for sensor fusion is increasingly important in image sensing. Edge-preserving filtering is available for various sensor fusion applications, such as estimating scene properties and refining inverse-rendered images. The main application is joint edge-preserving filtering, which can preferably reflect the edge information of a guidance image from an additional sensor. The drawback of edge-preserving filtering lies in its long computational time; thus, many acceleration methods have been proposed. However, most accelerated filters cannot handle multiple guidance images well, even though multiple guidance images provide various benefits. Therefore, we extend efficient edge-preserving filters so that they can use additional multiple guidance images. Our algorithm, named decomposed multilateral filtering (DMF), extends efficient filtering methods to multilateral filtering by decomposing the filter into a set of constant-time filters. Experimental results show that our algorithm performs efficiently and is sufficient for various applications.

1. Introduction

Multimodal signal processing for sensor fusion is increasingly important in image sensing. Sensor fusion can combine beneficial information from different sensors to generate a richer single image. Image signal fusion approaches have various applications: RGB and infrared image fusion [1,2,3], RGB and multispectral image fusion [4], intercolor RGB signal fusion [5,6], RGB and depth fusion [7,8], RGB and light fusion [9], RGB and computed edge fusion [10], different focal image fusion [11,12], CT and MRI signal fusion for medical image processing [3], Retinex-based enhancement [13], SAR and multispectral image fusion [14], and general signal fusion [15].
Filtering is a basic tool for handling such multimodal signals. Multilateral filtering, which is one type of edge-preserving filtering, successfully handles multiple signals. Edge-preserving filtering with additional guidance information, called joint edge-preserving filtering, has recently attracted attention from image processing and computational photography researchers for sensor fusion. Joint edge-preserving filtering helps transfer major characteristics from guidance images, which are not the filtering targets themselves. Various applications use these filters, including flash/no-flash photography [16,17], up-sampling/super resolution [18], compression noise removal [19], alpha matting [20], haze removal [21], rain removal [22], depth refinement [23,24], stereo matching [25,26], and optical flow estimation [27].
Joint/cross bilateral filtering [16,17] is a seminal work of joint edge-preserving filtering. The filter is naturally derived from bilateral filtering [28] by computing the kernel weight from a guidance image instead of an input filtering image. This formulation enables us to reflect the edge information of the guidance image (e.g., RGB, infrared, and hyperspectral images) to the filtering target image (e.g., RGB image, alpha mask, depth map, and optical flow).
We can expect a higher edge-preserving effect using multiple guidance images (e.g., a set of multi-sensor signals), and recently, we have been able to capture not only RGB images but also infrared, hyperspectral, depth, and other images by new devices (e.g., infrared/hyperspectral cameras and depth sensors). The images have different edge information and signal characteristics from RGB images. The multiple guidance information is helpful for improving signal visibility and the signal-to-noise ratio [29,30]. In other cases, we can deal with a self-generated image and inversely rendered maps as an additional guidance signal [31,32].
There are two categories for using multiple guidance images in image filtering: high-dimensional filtering [30,33,34,35,36,37] and multilateral filtering [29,31,32,38,39]. The former is additive logic, and the latter is multiplicative logic for additional kernels. The main difference is the severity of the restriction to compute the kernel weight. The restriction of additive logic is looser than that of multiplicative logic; hence, high-dimensional filtering can robustly smooth out noise or rich textures. By contrast, since the restriction of the multiplicative logic is severe, multilateral filtering produces fewer blurred regions. Each filtering method has advantages and disadvantages, but multilateral filtering is preferred when we expect a sharply edge-preserving effect.
A critical issue of edge-preserving filtering for multimodal sensing is computational time. This is because sensing is the gateway to all processing, and signal processing during sensing is expected to operate in real time. Therefore, many researchers have proposed acceleration methods for edge-preserving filtering. In particular, the acceleration of bilateral filtering has been actively discussed. The bilateral grid [40,41] is the seminal approach, and Yang et al. [42,43] extended it to constant-time bilateral filtering. Yang's method [43] has adequate efficiency for grayscale images, and recent work further accelerates the bilateral filter [44,45,46]; however, these methods are inefficient in color cases. There are several proposals [34,47,48] to approximate and accelerate bilateral filtering in the case of higher-dimensional (color) images. Furthermore, hardware-friendly methods have been proposed [49,50,51]. However, these approaches restrict the kernel weight to kernels defined by the Gaussian distribution. In contrast to bilateral filtering acceleration, other efficient edge-preserving filters that are not limited to the Gaussian distribution have been proposed. Guided image filtering [20], domain transform filtering [33], and adaptive manifold filtering [30] are representative examples. These filters rely on assumptions different from Gaussian smoothing but have excellent edge-preserving effects and efficiency. Note that these filters can handle similar signals better than those with different modalities and characteristics.
Multiple guidance images provide richer information for various applications; however, these efficient methods cannot individually handle multiple guidance images. Therefore, we propose an efficient algorithm for accelerating multilateral filtering, which is developed for multiple-guidance image filtering. Furthermore, we extend the efficient edge-preserving filters so that they can exploit multiple guidance images.
Our algorithm is based on the fact that n-lateral filtering can be represented by a summation of $(n-1)$-lateral filtering. Therefore, when multilateral filtering is expanded recursively, it reduces to constant-time filtering, since 1-lateral filtering is spatial filtering. Figure 1 shows an overview of the proposed filter algorithm. The proposed filter, named DMF (decomposed multilateral filtering), recursively decomposes multilateral (n-lateral) filtering by splatting into $(n-1)$-lateral filtering until it becomes a constant-time filter. Then, the results of constant-time filtering for the decomposed components are merged into the result of multilateral filtering.
The contributions of this paper are summarized as follows:
1. Introducing a constant-time algorithm for multilateral filtering (Section 5);
2. Extending various filters (e.g., guided image filtering [20] and domain transform filtering [33]) to deal with multiple guidance information (Section 6.1);
3. Proposing a multilateral extension to the filter that uses the filtering output as a guidance image, such as rolling guidance filters [52] (Section 6.2).

2. Related Work

Due to physical constraints, a single image sensor cannot simultaneously capture rich information such as resolution, wavelength range, focus, dynamic range, and scene features. Image fusion is one way to solve this problem. Research on image fusion is active, with the number of papers increasing each year, as well as many survey papers [53,54,55,56,57,58,59,60,61,62]. Image fusion involves smoothing, denoising, enhancement, sharpening, super-resolution, and blending for multiple signals to obtain the desired signal. Image fusion is mainly divided into digital photography image fusion and multimodal image fusion.
Digital photography image fusion combines images taken by the same sensor with different sensor settings and includes multi-focus image fusion, multi-exposure image fusion, multi-temporal image fusion, and multi-view image fusion. In multi-focus image fusion, an all-in-focus image is synthesized from images taken at different focus settings, and in multi-exposure image fusion, a wide dynamic range image is synthesized from images taken at different dynamic ranges. Multi-exposure image fusion also includes the use of different external flash environments. Multi-temporal image fusion synthesizes signals that vary along a time axis, while multi-view image fusion synthesizes signals from camera motion or multiple cameras capturing a scene.
Multimodal image fusion combines different characteristics of multiple sensors into one, including RGB-IR fusion, multi-hyperspectral-panchromatic image fusion, RGB-depth/LiDAR fusion, and medical image fusion (CT, PET, MRI, SPECT, X-ray), etc. In RGB-IR fusion, visible images are combined with IR images, taking advantage of the high contrast of IR and the good texture characteristics of RGB in the visible region. It also combines images using the different wavelength bands that can be captured by external flashes. In multi-hyperspectral-panchromatic image fusion, sensors that acquire images in different wavelength bands and resolutions are combined, and each sensor often has a different resolution and noise sensitivity; the objective is to improve the resolution and noise sensitivity of each sensor. RGB-depth/LiDAR fusion corrects depth sensor output using RGB images, including upsampling of depth information, interpolation of missing regions, contour correction, and noise reduction. Medical image fusion integrates the outputs of various medical sensors in the same dimension to assist in diagnosis.
Among these image fusion methods, those that improve the acquisition signal are called sharpening fusion, which aims at signal denoising, sharpening, contrast improvement, and resolution improvement. In image fusion, various tools are used, such as weighted smoothing filtering, morphology filtering, principal component analysis (PCA), Laplacian pyramid, discrete cosine transformation (DCT), discrete Fourier transformation (DFT), discrete wavelet transform (DWT), etc. This paper is an extension of the weighted smoothing method. In particular, the proposed method extends existing smoothing/weighted smoothing methods to guided smoothing and has a wide range of applications.

3. Preliminaries

In this section, we review the previous work of constant-time bilateral filtering proposed by Yang et al. [42,43]. Bilateral filtering [28] is a representative edge-preserving smoothing filter defined in a finite impulse response (FIR) manner. This filtering achieves edge-preserving effects by filtering in the range and spatial domains; thus, its filtering kernel weights are derived from a product of spatial and range weights based on a Gaussian distribution. Let the input and output images be denoted as $I, O^B: \mathcal{S} \mapsto \mathcal{R}$, where $\mathcal{S} \subset \mathbb{Z}^D$ is the spatial domain, $\mathcal{R} = [0, N-1]^d \subset \mathbb{R}^d$ is the range domain, and $d$ is the color range dimension (generally, $D = 2$, $N = 256$, and $d = 3$), respectively. Bilateral filtering is formulated as follows:
$$O_p^B = \frac{\sum_{q \in N_p} f_S(p, q)\, f_R(I_p, I_q)\, I_q}{\sum_{q \in N_p} f_S(p, q)\, f_R(I_p, I_q)},\qquad(1)$$
where $p, q \in \mathcal{S}$ represent a target pixel and a neighboring pixel of $p$, respectively. $I_p, I_q \in \mathcal{R}$ are the pixel values at $p$ and $q$. $N_p \subset \mathcal{S}$ is the set of neighboring pixels of $p$. $f_S: \mathcal{S} \times \mathcal{S} \mapsto \mathbb{R}$ and $f_R: \mathcal{R} \times \mathcal{R} \mapsto \mathbb{R}$ are weight functions based on the Gaussian distribution whose smoothing parameters are $\sigma_S$ and $\sigma_R$, respectively. Here, we can formulate joint bilateral filtering [16,17] by replacing $I$ in (1) with an arbitrary additional guidance image $J: \mathcal{S} \mapsto \mathcal{R}$.
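For concreteness, the following C++ sketch implements the joint bilateral filter of (1) with the guidance image $J$ in its range kernel. It is a minimal sketch assuming single-channel float images stored row-major in std::vector, Gaussian spatial and range kernels, and clamped borders; the Image struct and function name are illustrative, not taken from our released code.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Image {
    int width = 0, height = 0;
    std::vector<float> data;                      // row-major, size = width * height
    float at(int x, int y) const { return data[y * width + x]; }
};

Image jointBilateral(const Image& I, const Image& J, int r, float sigmaS, float sigmaR) {
    Image O = I;                                  // output image, same size as the input
    for (int y = 0; y < I.height; ++y) {
        for (int x = 0; x < I.width; ++x) {
            float num = 0.f, den = 0.f;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    int qx = std::min(std::max(x + dx, 0), I.width - 1);   // clamp border
                    int qy = std::min(std::max(y + dy, 0), I.height - 1);
                    // spatial weight f_S and range weight f_R; f_R uses the guidance J
                    float ws = std::exp(-(dx * dx + dy * dy) / (2.f * sigmaS * sigmaS));
                    float dr = J.at(x, y) - J.at(qx, qy);
                    float wr = std::exp(-(dr * dr) / (2.f * sigmaR * sigmaR));
                    num += ws * wr * I.at(qx, qy);
                    den += ws * wr;
                }
            }
            O.data[y * O.width + x] = num / den;
        }
    }
    return O;
}
```

Replacing $J$ with $I$ itself recovers plain bilateral filtering; the cost grows with the kernel radius $r$, which motivates the constant-time decompositions reviewed next.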
Naïve bilateral filtering is an $O(r^2)$ per-pixel algorithm, where $r$ is the filtering kernel radius; thus, the computational cost increases quadratically with the kernel size. Several constant-time-per-pixel algorithms for bilateral filtering have been proposed to solve this problem. In particular, the algorithm proposed by Yang et al. [42,43] is the basis of the proposed method.
Yang et al. proposed a constant-time algorithm by extending the bilateral grid [40,41]. The algorithm decomposes bilateral filtering into a set of spatial filtering that can be computed in constant-time (e.g., box filter using integral image [63,64] and the recursive Gaussian filter [65,66,67,68]). The decomposition is conducted by computing principle bilateral filtered image components (PBFICs) [43] from the input or guidance image. Since arbitrary range filtering weights can generate PBFICs, the algorithm can compute the arbitrary bilateral filtering response in the range kernel.
Yang's algorithm [43] is further extended to multichannel images in [42]. The extended algorithm processes multichannel images by preparing multichannel PBFICs with combinations of pixel values in each channel. However, this extension requires uniform processing for all channels; in other words, we cannot filter each channel differently. This indicates that there is room to extend the algorithm for the case of multichannel or multiple guidance images with different characteristics in each channel.
Our algorithm is inspired by Yang’s algorithm [42,43], which represents bilateral filtering by a set of spatial filtering. In contrast, our algorithm decomposes a filter for multichannel images into arbitrary constant-time filters.

4. Relationship between Multilateral Filtering and Higher-Dimensional Filtering

In this section, we compare the filtering properties between multilateral filtering (MF) and high-dimensional filtering (HDF). The main difference between them is the logic to compute the filtering weight. The weight of multilateral filtering $f_M \in \mathbb{R}$ is computed by the multiplicative logic from the spatial weight and the range weights of multiple guidance images:
$$f_M(p, q) = f_S(p, q) \prod_{i=1}^{m} f_{R_i}(J_p^i, J_q^i),\qquad(2)$$
where $f_{R_i}: \mathcal{R}_i \times \mathcal{R}_i \mapsto \mathbb{R}$ is the filtering weight for the i-th guidance image $J^i: \mathcal{S} \mapsto \mathcal{R}_i$, where $\mathcal{R}_i$ is the range domain of $J^i$, and $m$ is the number of guidance images. An early work on MF was proposed by Choudhury and Tumblin [32]. Each range weight $f_{R_i}$ for the guidance image is individually defined to represent the characteristics of the image.
HDF's weight $f_H \in \mathbb{R}$ is computed by the additive logic:
$$f_H(p, q) = \rho\left( \sum_{i=1}^{|V_p|} l_\gamma\big( V_p(i) - V_q(i) \big) \right),\qquad(3)$$
where $\rho \in \mathbb{R}$ denotes an arbitrary weight function at the pixel $p$; $l_\gamma \in \mathbb{R}$ denotes an arbitrary norm function; $V_p$ denotes higher-dimensional information consisting of spatial and range information, e.g., $V_p = (x_p, y_p, r_p, g_p, b_p)$ in an RGB image, and $|V_p| \in \mathbb{Z}$ is the size of $V_p$. The work of Gastal and Oliveira [30] is a successful extension of HDF with multiple guidance information. They exploited additional guidance information to increase the higher-dimensional information $V$.
The two logics differ in the severity of the restriction used to compute the kernel weight; the multiplicative logic's restriction is more severe than the additive logic's. This difference affects the edge-preserving performance. Figure 2 shows examples of HDF and MF weights. HDF assigns low weights as a whole, even to guidance pixels that are hardly relevant to the target pixel. In contrast, MF concentrates its weights on guidance pixels whose values are similar to the target pixel value. In this way, MF has a high edge-preservation effect; hence, it is preferred when edge preservation is significant.

5. Proposed Method: Decomposed Multilateral Filtering

The proposed DMF first decomposes MF into a set of constant-time filters. This allows us to reduce the computational complexity from $O(r^2)$ to $O(1)$ per pixel. This section defines MF and proves its decomposability. Algorithm 1 summarizes the flow of DMF. Next, we discuss the extension of the algorithm.

5.1. Definition of Multilateral Filtering

MF assumes that the filtering weight is derived from the multiplicative logic discussed in Section 4. Furthermore, MF is equivalent to n-lateral filtering when $n-1$ guidance images are used for filtering. When $n = 1$ or $2$, n-lateral filtering means spatial filtering or bilateral filtering, respectively. Therefore, we assume $n \ge 3$ in this section and compute the n-lateral filtering output $O^n: \mathcal{S} \mapsto \mathcal{R}$ as follows:
$$O_p^n = \frac{\sum_{q \in N_p} f_n(p, q)\, I_q}{\sum_{q \in N_p} f_n(p, q)},\qquad(4)$$
$$f_n(p, q) = f_S(p, q) \prod_{i=1}^{n-1} f_{R_i}(J_p^i, J_q^i) \quad (n \ge 2),\qquad(5)$$
$$f_1(p, q) = f_S(p, q) \quad (n = 1),\qquad(6)$$
where $f_{R_i}$ denotes the range filtering weight for the i-th guidance image $J^i$, and $f_n \in \mathbb{R}$ is the n-lateral filtering weight. Equations (4) and (5) are the basic formulation of MF. Here, we basically define the first filter $f_1$ in (6) as a spatial filter, which can be an arbitrary linear time-invariant (LTI) filter (e.g., box, circular, Gaussian, and Gabor filters) or a linear time-variant (LTV) filter (e.g., the spatially adaptive Gaussian filter [69]). LTI filters can be performed in $O(1)$ by sliding-DCT filtering [68], but adaptive filters are difficult to convert into $O(1)$ filters.
Algorithm 1 Decomposed Multilateral Filtering
  • function n-lateral_filtering($n$, $\mathcal{J}$, $I$)
  •     // $\mathcal{J} = \{J^1, J^2, \ldots, J^{n-1}\}$
  •     for all labels $k$ ($0 \le k \le T_{n-1} - 1$) do
  •         // Splatting as Equations (11) and (12)
  •         $W_{L_k}^{n-1} \leftarrow f_{R_{n-1}}(L_k^{n-1}, J^{n-1})$
  •         $K_{L_k}^{n-1} \leftarrow W_{L_k}^{n-1} \odot I$
  •         // $(n-1)$-lateral filtering
  •         if $n-1 \ge 2$ then
  •             $W_{L_k}^{n-1} \leftarrow$ n-lateral_filtering($n-1$, $\mathcal{J}$, $W_{L_k}^{n-1}$)
  •             $K_{L_k}^{n-1} \leftarrow$ n-lateral_filtering($n-1$, $\mathcal{J}$, $K_{L_k}^{n-1}$)
  •         else
  •             // Final filtering step
  •             $W_{L_k}^{n-1} \leftarrow f_S * W_{L_k}^{n-1}$
  •             $K_{L_k}^{n-1} \leftarrow f_S * K_{L_k}^{n-1}$
  •         end if
  •         // Normalization as Equation (13)
  •         $C_{L_k}^{n-1} \leftarrow K_{L_k}^{n-1} / W_{L_k}^{n-1}$
  •     end for
  •     // Interpolation as Equation (15)
  •     // $\mathcal{L}^{n-1} = \{L_0^{n-1}, \ldots, L_{T_{n-1}-1}^{n-1}\}$
  •     $O^n \leftarrow$ Interpolation($J^{n-1}$, $\mathcal{L}^{n-1}$, $\{C_{L_k}^{n-1}\}_k$)
  •     return $O^n$
  • end function
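To make the recursion concrete, the following is a minimal, non-optimized C++ sketch of Algorithm 1 under simplifying assumptions: grayscale float guides with values in [0, 255], a single shared Gaussian range parameter, a naïve box filter standing in for the constant-time spatial filter $f_S$, and every tone treated as its own label, so the final interpolation step reduces to selecting the component image of the pixel's own tone. All structure and function names are illustrative, not from our released code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Img {
    int w = 0, h = 0;
    std::vector<float> v;                          // row-major, size w * h
    Img() = default;
    Img(int W, int H) : w(W), h(H), v(std::size_t(W) * H, 0.f) {}
};

// Naive box filter as a stand-in for the constant-time spatial filter f_S.
Img boxFilter(const Img& s, int r) {
    Img o(s.w, s.h);
    for (int y = 0; y < s.h; ++y)
        for (int x = 0; x < s.w; ++x) {
            float sum = 0.f; int cnt = 0;
            for (int dy = -r; dy <= r; ++dy)
                for (int dx = -r; dx <= r; ++dx) {
                    int qx = x + dx, qy = y + dy;
                    if (qx < 0 || qy < 0 || qx >= s.w || qy >= s.h) continue;
                    sum += s.v[qy * s.w + qx]; ++cnt;
                }
            o.v[y * o.w + x] = sum / cnt;
        }
    return o;
}

float gaussRange(float a, float b, float sigma) {
    float d = a - b;
    return std::exp(-d * d / (2.f * sigma * sigma));
}

// n-lateral filtering of I with guides J[0..n-2]; n == 1 is plain spatial filtering.
Img dmf(int n, const std::vector<Img>& J, const Img& I, float sigmaR, int r) {
    if (n == 1) return boxFilter(I, r);            // f_1 = f_S
    const Img& G = J[n - 2];                       // the (n-1)-th guidance image
    Img O(I.w, I.h);
    for (int c = 0; c < 256; ++c) {                // every tone is its own label
        // splatting: W = range weight against label c, K = W * I (element-wise)
        Img W(I.w, I.h), K(I.w, I.h);
        for (std::size_t i = 0; i < I.v.size(); ++i) {
            W.v[i] = gaussRange(float(c), G.v[i], sigmaR);
            K.v[i] = W.v[i] * I.v[i];
        }
        // (n-1)-lateral filtering of both coefficient images (recursion)
        Img Kf = dmf(n - 1, J, K, sigmaR, r);
        Img Wf = dmf(n - 1, J, W, sigmaR, r);
        // normalization and per-pixel selection of the matching component image
        for (std::size_t i = 0; i < I.v.size(); ++i)
            if (int(std::lround(G.v[i])) == c) O.v[i] = Kf.v[i] / Wf.v[i];
    }
    return O;
}
```

With two guides, dmf(3, {J1, J2}, I, sigmaR, r) corresponds to trilateral filtering; the tonal and spatial subsampling described in Sections 5.3 and 5.4 are what make this decomposition practical.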

5.2. Recursive Representation for Decomposed Multilateral Filtering

We introduce DMF and prove the decomposability of MF in this subsection. In (5), the n-lateral filtering weight $f_n$ can be replaced with the product of its one-dimension-lower weight $f_{n-1}$ and the range weight $f_{R_{n-1}}$:
$$f_n(p, q) = f_{n-1}(p, q)\, f_{R_{n-1}}(J_p^{n-1}, J_q^{n-1}).\qquad(7)$$
We can re-formulate MF from (4) using (7) as follows:
$$O_p^n = \frac{\sum_{q \in N_p} f_{n-1}(p, q)\, f_{R_{n-1}}(J_p^{n-1}, J_q^{n-1})\, I_q}{\sum_{q \in N_p} f_{n-1}(p, q)\, f_{R_{n-1}}(J_p^{n-1}, J_q^{n-1})}.\qquad(8)$$
This form shows that we can express n-lateral filtering using the $(n-1)$-lateral filtering weight. Furthermore, we rewrite (8) under the additional assumption that the pixel values of the guidance images are discrete. Let $c \in \mathcal{R}_{n-1} = [0, N_{n-1}-1]$ be a constant value, where $N_{n-1}$ is the number of tones in the range of the $(n-1)$-th guidance image. When the pixel value of the $(n-1)$-th guidance image $J_p^{n-1}$ in (8) is replaced by the constant value $c$, it is rewritten as follows:
$$C_{c, p}^{n-1} = \frac{\sum_{q \in N_p} f_{n-1}(p, q)\, f_{R_{n-1}}(c, J_q^{n-1})\, I_q}{\sum_{q \in N_p} f_{n-1}(p, q)\, f_{R_{n-1}}(c, J_q^{n-1})},\qquad(9)$$
$$O_p^n = C_{v, p}^{n-1} \quad \mathrm{s.t.} \quad v = \arg\min_x \| x - J_p^{n-1} \|_1,\qquad(10)$$
where $\|\cdot\|_1$ is the $l_1$ norm operator. We call $C_c^{n-1}: \mathcal{S} \mapsto \mathcal{R}$ a component image of n-lateral filtering, and its pixel value at $p$ is denoted by $C_{c, p}^{n-1} \in \mathcal{R}$.
$f_{R_{n-1}}(c, J_q^{n-1})$ and $f_{R_{n-1}}(c, J_q^{n-1})\, I_q$ in (9) can be cached as images in constant time; hence, we express these coefficients as the following images:
$$W_{c, q}^{n-1} = f_{R_{n-1}}(c, J_q^{n-1}),\qquad(11)$$
$$K_{c, q}^{n-1} = W_{c, q}^{n-1} I_q,\qquad(12)$$
where $W_c^{n-1}: \mathcal{S} \mapsto \mathbb{R}$ and $K_c^{n-1}: \mathcal{S} \mapsto \mathcal{R}$ are the elements of the denominator and numerator in (9), respectively. We call the processes of Equations (11) and (12) splatting, following [47,48], and we call these images coefficient images. For simplification, we rewrite (9) using the coefficient images and the convolution operator $*$:
$$C_c^{n-1} = \frac{f_{n-1} * K_c^{n-1}}{f_{n-1} * W_c^{n-1}},\qquad(13)$$
where the pixel index $p$ is dropped. Figure 3 shows the splatting procedure in DMF.
The denominator and numerator in (13) represent $(n-1)$-lateral filtering. This indicates that n-lateral filtering has been decomposed into $(n-1)$-lateral filtering. Therefore, MF can be decomposed recursively:
$$f_n = g_{n-1} f_{n-1} = g_{n-1} g_{n-2} \cdots g_1 f_1,\qquad(14)$$
where $g_x$ denotes a decomposing operator, as described in Equations (11)–(13). Equation (14) summarizes the DMF formulation. Since $f_1$ (e.g., Gaussian filtering) can be computed in constant time per pixel by recursive algorithms, and the decomposing operation is independent of the kernel size, DMF can also be computed in constant time.

5.3. Tonal Range Subsampling

The exact filtering result can be obtained by computing coefficient images for all values $c \in \mathcal{R}_{n-1}$ appearing in the guidance image $J^{n-1}$. Here, we increase efficiency by quantizing the guidance tonal ranges. Let $\mathcal{L}^{n-1}$ be a quantized set of $\mathcal{R}_{n-1}$, and let $T_{n-1} = |\mathcal{L}^{n-1}|$ be the number of tones in the quantized tonal range of the $(n-1)$-th guidance image, where $T_{n-1} \ll |\mathcal{R}_{n-1}|$. Furthermore, let $L_k^{n-1} \in \mathcal{L}^{n-1}$ be the k-th label's value, where $k \in \{0, 1, \ldots, T_{n-1}-1\}$, so that the label values cover the quantized range domain. We can obtain the final output of DMF by linear interpolation of the current and next component images (i.e., $C_{L_k^{n-1}}^{n-1}$ and $C_{L_{k+1}^{n-1}}^{n-1}$):
$$O_p^n \approx (L_{k+1}^{n-1} - J_p^{n-1})\, C_{L_k^{n-1}, p}^{n-1} + (J_p^{n-1} - L_k^{n-1})\, C_{L_{k+1}^{n-1}, p}^{n-1}\qquad(15)$$
$$\mathrm{s.t.} \quad k = \arg\min_x \| J_p^{n-1} - L_x^{n-1} \|_1.$$
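A small sketch of this lookup is given below, assuming $T$ equally spaced labels on [0, 255] and component images stored as flat float vectors; the interpolation weight is normalized by the label spacing here, which matches (15) when the labels are renormalized to unit spacing. All names are illustrative.

```cpp
#include <vector>

// components[k] holds the component image C for label L_k = k * 255.0f / (T - 1).
float lookupInterpolated(const std::vector<std::vector<float>>& components,
                         int T, float guideValue, int pixelIndex) {
    const float step = 255.0f / float(T - 1);
    int k = int(guideValue / step);                       // lower bracketing label
    if (k >= T - 1) return components[T - 1][pixelIndex]; // guide value at the top label
    const float Lk = k * step, Lk1 = (k + 1) * step;
    const float a = (guideValue - Lk) / (Lk1 - Lk);       // in [0, 1]
    return (1.f - a) * components[k][pixelIndex] + a * components[k + 1][pixelIndex];
}
```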

5.4. Spatial Domain Subsampling

Considering the sparsity of the coefficient images, we can also apply subsampling in the spatial domain to further increase efficiency, as discussed in [41,42]. In the DMF case, we can apply spatial subsampling to the coefficient images at several steps: the first and the second decompositions. If we apply spatial subsampling to DMF in the n-lateral filtering splatting, the process is computed as:
$$K_c^{\downarrow n-1} = \mathrm{downsample}(K_c^{n-1}),\qquad(16)$$
$$W_c^{\downarrow n-1} = \mathrm{downsample}(W_c^{n-1}),\qquad(17)$$
$$C_c^{\downarrow n-1} = \frac{f_{n-1} * K_c^{\downarrow n-1}}{f_{n-1} * W_c^{\downarrow n-1}},\qquad(18)$$
$$C_c^{n-1} \approx C_c^{\uparrow n-1} = \mathrm{upsample}(C_c^{\downarrow n-1}),\qquad(19)$$
where $X^{\downarrow}$ and $X^{\uparrow}$ ($X \in \{K, W, C\}$) are the downsampled and upsampled images, respectively. We use averaging of the nearest-neighbor pixels for subsampling and linear interpolation for upsampling. Our method can apply different ratios of spatial subsampling to arbitrary guidance channels based on the sparsity of each channel (e.g., the YUV image components of the JPEG and MPEG formats, or RGB-D images). This flexibility is an advantage over Yang's algorithm [42].
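The sketch below illustrates where this subsampling plugs in, assuming a fixed 2x factor and even image sizes; following the text, downsampling averages neighboring pixels, while the upsampling here uses nearest-neighbor lookup for brevity (linear interpolation in practice). The Plane struct and the fixed factor are illustrative assumptions.

```cpp
#include <algorithm>
#include <vector>

struct Plane { int w, h; std::vector<float> v; };         // row-major float image

Plane downsample2x(const Plane& s) {                      // average of each 2x2 block
    Plane d{ s.w / 2, s.h / 2, std::vector<float>((s.w / 2) * (s.h / 2)) };
    for (int y = 0; y < d.h; ++y)
        for (int x = 0; x < d.w; ++x)
            d.v[y * d.w + x] = 0.25f * (s.v[(2 * y) * s.w + 2 * x]
                                      + s.v[(2 * y) * s.w + 2 * x + 1]
                                      + s.v[(2 * y + 1) * s.w + 2 * x]
                                      + s.v[(2 * y + 1) * s.w + 2 * x + 1]);
    return d;
}

Plane upsample2x(const Plane& s, int W, int H) {          // back to full resolution
    Plane u{ W, H, std::vector<float>(W * H) };
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x)
            u.v[y * W + x] = s.v[std::min(y / 2, s.h - 1) * s.w + std::min(x / 2, s.w - 1)];
    return u;
}

// Usage in DMF (Eqs. (16)-(19)): filter downsample2x(K) and downsample2x(W) with f_{n-1},
// take their ratio, and upsample2x the normalized component image back to full size.
```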

6. Extension of Decomposed Multilateral Filtering

6.1. Beyond Gauss Transform

DMF can deal with arbitrary multilateral filtering responses and is not limited to the Gauss transform [70], which is a combination of Gaussian filters. DMF can select arbitrary range and spatial filters. Furthermore, we can select the filtering response by changing the final filtering step.
In principle, DMF is decomposed down to the spatial filtering in (6). However, DMF does not always require decomposition until spatial filtering is reached. Specifically, we can apply any joint edge-preserving filter in the final filtering step while decreasing the number of decompositions. Some edge-preserving filters can handle multichannel signals in their designed weight functions. For example, high-dimensional Gaussian filtering handles multichannel signals by the Gaussian distribution with the Euclidean distance; instead, domain transform filtering uses the geodesic distance. Guided image filtering handles them by the local linear model with the $l_2$ norm between signals.
Therefore, when the final filtering step is performed by an edge-preserving filter, the number of DMF decompositions can be reduced by the number of dimensions handled by that edge-preserving filter. Let the handled signal set be $\mathcal{G} = \{J^s, \ldots, J^1\}$, where $s$ is the number of handled channels (i.e., $s = 3$ in the RGB image case). Using edge-preserving filtering, (13) of the final step is replaced as:
$$C_c^{s} = \frac{H_{\mathcal{G}} * K_c^{s}}{H_{\mathcal{G}} * W_c^{s}},\qquad(20)$$
where $H_{\mathcal{G}}$ represents any joint edge-preserving filter with the guidance signal set $\mathcal{G}$. Examples of the final-step filtering are high-dimensional filtering (high-dimensional Gaussian filtering [47,48], guided image filtering [20,71], domain transform filtering [33], adaptive manifold filtering [30]), frequency transform filtering (edge-avoiding wavelets [72,73], redundant frequency transform [74]), adaptive filtering (range parameter adaptive filtering [75]), enhancement filtering (local Laplacian filtering [76,77,78]), statistical filtering (fast guided median filtering [79]), LUT-based filtering [80], and optimization-based filtering (weighted least squares optimization [81,82] and L0 smoothing optimization [83]).
This representation allows us to extend various edge-preserving filtering methods to handle multiple guidance images. This fact is helpful for various applications since the required filtering properties, e.g., local linearity [20] and geodesic distance [33], differ by application. Note that the merged treatment of dimensions has the potential to be faster, but the filter may not work well if the characteristics of the combined set are not identical.
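As a sketch of this replacement, the final decomposition step can simply take the remaining joint edge-preserving filter as a callable $H_{\mathcal{G}}$ whose guidance set is bound elsewhere (e.g., a wrapper around a guided image filter or domain transform filter). The type aliases below are illustrative assumptions, not a definitive implementation.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Image = std::vector<float>;                          // single-channel, row-major
using JointFilter = std::function<Image(const Image&)>;    // H_G with its guidance set bound inside

// Final filtering step of Eq. (20): C = H_G(K) / H_G(W), computed per pixel.
Image finalStep(const Image& K, const Image& W, const JointFilter& H_G) {
    Image Kf = H_G(K), Wf = H_G(W), C(K.size());
    for (std::size_t i = 0; i < C.size(); ++i) C[i] = Kf[i] / Wf[i];
    return C;
}
```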

6.2. Multilateral Rolling Guidance Filtering

MF is also helpful for self-generating multiple guidance information from single guidance information, as in rolling guidance filtering [52]. Rolling guidance filtering is iteratively processed using the filtered image as the guidance image. In this regard, the filtering image is fixed as the input image, and the guidance image varies. This iterative representation can be applied to multilateral filtering with some modifications. We call it multilateral rolling guidance filtering (MRGF) and show the actual processing in Figure 4. The main difference is that the filter output is used as an additional guidance image.
MRGF is particularly practical when edge information is essential, such as in image segmentation and feathering. Since we can reflect the smoothed or refined results in the filtering target image, it tends to refine the desirable features of the target image. Significantly, initial estimated maps of scene properties often contain noise and errors, and MRGF performs well in the refinement of such maps. We verify the performance of MRGF in the following experimental section.
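A minimal sketch of the MRGF iteration is given below, assuming a generic multilateral filter mf is supplied as a callable (for instance, the DMF sketch in Section 5 wrapped appropriately); the names and types are illustrative.

```cpp
#include <functional>
#include <vector>

using Image = std::vector<float>;
using MultilateralFilter =
    std::function<Image(const Image& target, const std::vector<Image>& guides)>;

// The filtering target I stays fixed; the previous output joins the guidance set.
Image mrgf(const Image& I, const std::vector<Image>& guides,
           const MultilateralFilter& mf, int iterations) {
    Image O = mf(I, guides);                     // initial pass with the given guides only
    for (int t = 1; t < iterations; ++t) {
        std::vector<Image> g = guides;
        g.push_back(O);                          // filter output as an additional guide
        O = mf(I, g);
    }
    return O;
}
```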

7. Experimental Results

We evaluated the proposed DMF in terms of accuracy and efficiency. The implementations of DMF and the competing methods are written in C++, and the codes are parallelized by OpenMP and vectorized by AVX2. We used an Intel Core i7-7700K (four cores, eight threads) and the Visual Studio 2022 compiler for the experiments.

7.1. Accuracy Evaluation

First, we evaluated how closely the DMF result corresponds to the result of the naïve MF implementation. In our experiments, we applied our algorithm to trilateral filtering [29]. Note that the Gauss transform is applied to the trilateral filtering weights for the spatial and range weights, where the standard deviations are $\sigma_s$, $\sigma_{r_1}$, and $\sigma_{r_2}$. Here, $\sigma_{r_1}$ and $\sigma_{r_2}$ are the parameters for the tonal ranges of the guidance and filtering images, respectively. We apply recursive Gaussian filtering with sliding DCT [68,84] for spatial filtering. We evaluate the accuracy of our algorithm on flash/no-flash denoising [16,17]. The range kernels $f_{R_1}$ and $f_{R_2}$ for MF are computed from the flash and no-flash images, respectively. We use the peak signal-to-noise ratio (PSNR) [85,86] as the objective measure of the approximation accuracy between the naïve results and the approximated results. The evaluation formula used is as follows:
$$\mathrm{PSNR} = 10 \log_{10} \frac{S \cdot 255^2}{\| A - G \|_2^2},$$
where $\|\cdot\|_2$ is the $l_2$ norm operator, $A$ is the approximated signal produced by DMF, and $G$ is the ground-truth signal produced by naïve MF. $S$ is the number of elements in the signals $A$ and $G$, $S = |A| = |G|$, where $|\cdot|$ returns the number of vector elements.
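A small sketch of this measure, assuming A and G are same-sized signals in the 8-bit range stored as float vectors:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

double psnr(const std::vector<float>& A, const std::vector<float>& G) {
    double sse = 0.0;                            // squared l2 norm of (A - G)
    for (std::size_t i = 0; i < A.size(); ++i) {
        double d = double(A[i]) - double(G[i]);
        sse += d * d;
    }
    return 10.0 * std::log10((double(A.size()) * 255.0 * 255.0) / sse);
}
```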
Figure 5 shows the results of the filtering accuracy in terms of the number of coefficient images. Note that $T_{\mathrm{2nd}}$ and $T_{\mathrm{3rd}}$ are the numbers of tonal-quantized coefficient images for the flash and no-flash images, respectively. The filtering accuracy is high overall in each case; eight coefficient images are enough for this result. This trend also holds with spatial subsampling, and spatial subsampling is practical because the PSNR accuracy stays above 45 dB. It can be seen that the PSNR degradation is more significant when downsampling at the first decomposition. As shown in the next section, downsampling at the first stage has a greater speedup effect; thus, it is up to the application to decide which one to choose.
Figure 6 shows the filtering accuracy with respect to the smoothing parameters. Although the accuracy of DMF varies depending on the parameters, ours remains high. We can see that it is not very sensitive to changes in the spatial parameter $\sigma_s$. The two range parameters $\sigma_{r_1}$ and $\sigma_{r_2}$ of the two guides behave similarly. The smaller the range parameter, the lower the approximation accuracy tends to be, and the effect is more pronounced when the number of decompositions is small. Initial subsampling is also susceptible to this effect. However, the proposed method has an approximation accuracy that is generally better than 45 dB for all parameters, which is sufficient because it becomes difficult for a person to distinguish between two images at around 40 dB [87].

7.2. Efficiency Evaluation

We compare the computational time in two cases for the efficiency evaluation. One is a comparison between naïve MF and DMF combined with several edge-preserving filters. The other is a comparison with and without subsampling. In this experiment, we apply real-time bilateral filtering (RTBF) [42,43], guided image filtering (GIF) [20], and domain transform filtering (DTF) [33] to (20). Note that we call DMF with these filters DMF-Gauss, DMF-GIF, and DMF-DTF, respectively. Since RTBF can be interpreted as DMF with one guidance image for Gaussian filtering, this combination is equivalent to DMF with two guidance images for Gaussian filtering; therefore, we refer to it as DMF-Gauss. For this experiment, the flash and no-flash images were converted to grayscale, and the RGB no-flash image was filtered. In DMF-Gauss, the two grayscale images were used as guides; in DMF-DTF and DMF-GIF, the no-flash images were used as guides for DMF, and the flash images were used as guides for DTF and GIF. Note that joint filtering is available for DTF and GIF. Cache-efficient filtering was computed using a one-pass version of Gaussian filtering for DMF-Gauss and box filtering for GIF [64].
Figure 7 shows the processing-time results. The processing time of naïve MF-Gauss increases rapidly as the filtering kernel size increases, whereas DMF can be computed in constant time, as shown in Figure 7a. DMF is especially efficient when we use GIF or DTF for the final filtering step described in Section 6.1. Furthermore, DMF becomes more efficient by subsampling the spatial domain, as shown in Figure 7b. Since DTF and GIF are not further decomposable by the proposed method, only a one-step decomposition is possible for them; therefore, the computation time for subsampling at the second decomposition is not reported.

7.3. Denoising Performance Evaluation

Here, we evaluate the denoising performance of the proposed method; note that this is not the approximation accuracy evaluated in Section 7.1. In our experiments, we used RGB-IR images and simulated RGB-IR fusion by adding noise to the RGB images. Performance is evaluated in terms of PSNR between the noiseless RGB image and the denoised image; the IR image is not evaluated in terms of PSNR because it is not noiseless and is not the final visible image.
The comparison methods are redundant DCT denoising (DCT) [74], domain transform filtering (DTF) [33], guided image filtering (GIF) [20], cross-field joint image restoration (CFJIR) [88], and high-dimensional kernel filtering (HDKF) [37]. DCT, DTF, and GIF are extended by the proposed method to handle an additional guidance IR image and are named DMF-DCT, DMF-DTF, and DMF-GIF. These methods were chosen for their high-speed performance. CFJIR and HDKF already use the characteristics of the guide image; thus, the proposed extension is not effective for them. For the evaluation images, we used the RGB-IR dataset, which includes ten images [37] (https://norishigefukushima.github.io/TilingPCA4CHDGF/ (accessed on 16 January 2024)).
Table 1, Table 2 and Table 3 show the PSNR results for each method at different noise levels. It can be seen that the classical DCT method achieves the best performance on average for all noise levels thanks to the DMF extension. CFJIR and HDKF are recent methods dedicated to RGB-IR denoising, and performance comparable to these methods is achieved by extending the classical filters with DMF. All DMF extensions also show a steady improvement in performance.

7.4. Channel Performance Evaluation

Here, we evaluated the denoising effect of the number of channels for flash and no-flash images. In Section 7.1, the guide images are two grayscale images (two channels), but here we use two RGB images (six channels). In addition, the number of channels is controlled by PCA dimensionality compression of the guide images [37,89]. Note that the denoising performance is different from the approximation performance. We used a flash/no-flash image dataset [37], which contains ten images. The images are filtered by multilateral filtering. In addition to PSNR, we used the structural similarity (SSIM) [90], which is a more robust quality metric. Noise was added only to the no-flash images.
Table 4 and Table 5 show the results for each metric. On average, the optimal value is obtained with four channels for every metric. SSIM, which is said to correlate well with human subjective evaluation, shows that the value is high enough even with two channels.
Table 6 shows the computational time. The computation time increases exponentially with the number of channels. This indicates that we suffer from the curse of dimensionality; therefore, it is better to use as few channels as possible.

7.5. Memory Usage Analysis

The memory requirement is linear in the number of pixels $N_p$. The number of tone combinations grows exponentially with the number of channels $N_c$ and the number of guidance images $N_j$. Consequently, the amount of memory required is $O(N_p N_t^{N_c N_j})$, where $N_t$ is the number of tones, according to Algorithm 1.
The vast memory requirement is one of the limitations, and the limitation is inherited from previous work [42,43]. However, tonal range and spatial domain subsampling can moderate the memory requirement. We can also make memory requirements independent of the number of channels by processing DMF, as discussed in [42,43]. The implementation, however, loses parallelizability and computational efficiency somewhat.

8. Multilateral Filtering for Computational Photography

We verify the effectiveness of MF and DMF by applying them to several sensor fusion applications in computational photography.

8.1. Flash/No-Flash Denoising

Flash/no-flash denoising [16,17] is the representative application for edge-preserving filtering with multiple guidance images. Flashing sometimes causes false edges (e.g., appearance/disappearance of shadow edges), as shown in Figure 8a. Joint bilateral filtering can remove noise in the no-flash image, but it simultaneously preserves the false edges of the flash image (Figure 8c). The conventional method requires multiple steps [17] to solve the problem, while MF requires only one step. As shown in Figure 8d, MF can remove noise while preventing false edges from being transferred. Note that we used the value information in the HSV color space of the no-flash image and the color-flash image as the guidance images. In addition, our algorithm can be efficiently computed by applying efficient edge-preserving filtering, such as domain transform filtering.

8.2. Depth Map Refining

Trilateral filtering is effective for refining depth maps degraded by lossy compression [29]. In the case of a single guidance image, the object boundaries in the depth maps are blurred even if the artifacts are removed; hence, there is a trade-off between denoising performance and the edge-preservation effect (Figure 9c). MF can mitigate this problem by considering both the edges in the depth map and those in the guidance image (Figure 9d). Although this depth refinement experiment targets the removal of coding artifacts, MF can also be applied to noise removal for a depth sensor.

8.3. Feathering

We demonstrate the property of our algorithm beyond the Gauss transform. Guided feathering [20] refines a binary mask for alpha matting near the boundary of the object. For guided feathering, guided image filtering has excellent performance in terms of efficiency and accuracy [20]. The result of the naïve guided image filter is shown in Figure 10c. We can confirm that the feathering is computed in detail; however, some noise is simultaneously produced near the object boundary regions. This is because the local linear model of the guided image filter is violated. By contrast, the MRGF result hardly includes such noise, as shown in Figure 10d, while the detailed feathering is still computed. This result indicates that MRGF prevents the violation.

8.4. Haze Removing

Furthermore, our algorithm with guided image filtering is also effective for haze removal [21]. A large kernel size is required for filtering in haze removal; thus, guided image filtering violates the local linear model, as in guided feathering. Consequently, some haze remains in Figure 11b. On the contrary, MRGF with guided image filtering can suppress the expansion across different objects, as shown in Figure 11c.

9. Conclusions

This paper presents an efficient algorithm for edge-preserving filtering with multiple guidance images for sensor fusion signals. Our algorithm, named decomposed multilateral filtering (DMF), can accelerate general multilateral filtering with the Gauss transform and extend various edge-preserving filtering methods to exploit multiple guidance images. In addition, we introduced a method to apply multilateral filtering to the output of multilateral filtering, such as rolling guidance filters [52], named multilateral rolling guidance filtering (MRGF). The experimental results showed that our algorithm has high accuracy and high efficiency. Furthermore, the proposed method is verified on various applications: flash/no-flash denoising, depth map refining, feathering, and haze removal.
The limitations of our algorithm are that the computational time depends on the image dimensionality and the number of guidance images. However, this problem can be solved by clustering [37,91]. In addition, automatic adjustment of the downsampling amount is also an issue. These issues can be resolved by extending Gaussian KD-trees [47] and permutohedral lattice [48].

Author Contributions

Conceptualization, N.F.; methodology, N.F.; software, H.N.; validation, Y.N. and Y.K.; formal analysis, H.N.; investigation, H.N.; resources, N.F.; data curation, H.N., Y.N. and Y.K.; writing—original draft preparation, H.N.; writing—review and editing, Y.M. and N.F.; visualization, Y.M. and N.F.; supervision, N.F.; project administration, Y.M. and N.F.; funding acquisition, Y.M. and N.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by JSPS KAKENHI (21H03465, 21K17768) and the Environment Research and Technology Development Fund (JPMEERF20222M01) of the Environmental Restoration and Conservation Agency of Japan.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Our code is available at https://fukushimalab.github.io/dmf/ (accessed on 16 January 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CFJIR	cross-field joint image restoration
CT	computed tomography
DCT	discrete cosine transform
DMF	decomposed multilateral filtering
DTF	domain transform filtering
FIR	finite impulse response
GIF	guided image filtering
HDF	high-dimensional filtering
HDKF	high-dimensional kernel filtering
HSV	hue, saturation and value
JPEG	joint photographic experts group
LTI	linear-time invariant
LTV	linear-time variant
MF	multilateral filtering
MPEG	moving picture experts group
MRI	magnetic resonance imaging
MRGF	multilateral rolling guidance filtering
PBFICs	principle bilateral filtered image components
PSNR	peak signal-to-noise ratio
RGB	red, green and blue
RGB-D	red, green, blue and depth
RTBF	real-time bilateral filtering
SIMD	single instruction, multiple data

References

  1. Jia, W.; Song, Z.; Li, Z. Multi-scale Fusion of Stretched Infrared and Visible Images. Sensors 2022, 22, 6660. [Google Scholar] [CrossRef]
  2. Li, H.; Xiao, Y.; Cheng, C.; Song, X. SFPFusion: An Improved Vision Transformer Combining Super Feature Attention and Wavelet-Guided Pooling for Infrared and Visible Images Fusion. Sensors 2023, 23, 7870. [Google Scholar] [CrossRef]
  3. Chen, H.; Deng, L.; Zhu, L.; Dong, M. ECFuse: Edge-Consistent and Correlation-Driven Fusion Framework for Infrared and Visible Image Fusion. Sensors 2023, 23, 8071. [Google Scholar] [CrossRef] [PubMed]
  4. Monno, Y.; Kiku, D.; Tanaka, M.; Okutomi, M. Adaptive Residual Interpolation for Color and Multispectral Image Demosaicking. Sensors 2017, 17, 2787. [Google Scholar] [CrossRef] [PubMed]
  5. Morillas, S.; Gregori, V.; Sapena, A. Adaptive Marginal Median Filter for Colour Images. Sensors 2011, 11, 3205–3213. [Google Scholar] [CrossRef]
  6. Morillas, S.; Gregori, V. Robustifying Vector Median Filter. Sensors 2011, 11, 8115–8126. [Google Scholar] [CrossRef]
  7. Le, A.V.; Jung, S.W.; Won, C.S. Directional Joint Bilateral Filter for Depth Images. Sensors 2014, 14, 11362–11378. [Google Scholar] [CrossRef] [PubMed]
  8. Chen, L.; Li, Q. An Adaptive Fusion Algorithm for Depth Completion. Sensors 2022, 22, 4603. [Google Scholar] [CrossRef]
  9. Takeda, J.; Fukushima, N. Poisson disk sampling with randomized satellite points for projected texture stereo. Opt. Contin. 2022, 1, 974–988. [Google Scholar] [CrossRef]
  10. Cheong, H.; Chae, E.; Lee, E.; Jo, G.; Paik, J. Fast Image Restoration for Spatially Varying Defocus Blur of Imaging Sensor. Sensors 2015, 15, 880–898. [Google Scholar] [CrossRef]
  11. Yang, Y.; Tong, S.; Huang, S.; Lin, P. Dual-Tree Complex Wavelet Transform and Image Block Residual-Based Multi-Focus Image Fusion in Visual Sensor Networks. Sensors 2014, 14, 22408–22430. [Google Scholar] [CrossRef]
  12. Li, Q.; Yang, X.; Wu, W.; Liu, K.; Jeon, G. Multi-Focus Image Fusion Method for Vision Sensor Systems via Dictionary Learning with Guided Filter. Sensors 2018, 18, 2143. [Google Scholar] [CrossRef]
  13. Oishi, S.; Fukushima, N. Retinex-Based Relighting for Night Photography. Appl. Sci. 2023, 13, 1719. [Google Scholar] [CrossRef]
  14. Huang, D.; Tang, Y.; Wang, Q. An Image Fusion Method of SAR and Multispectral Images Based on Non-Subsampled Shearlet Transform and Activity Measure. Sensors 2022, 22, 7055. [Google Scholar] [CrossRef]
  15. Xiao, Y.; Guo, Z.; Veelaert, P.; Philips, W. General Image Fusion for an Arbitrary Number of Inputs Using Convolutional Neural Networks. Sensors 2022, 22, 2457. [Google Scholar] [CrossRef] [PubMed]
  16. Eisemann, E.; Durand, F. Flash Photography Enhancement via Intrinsic Relighting. ACM Trans. Graph. 2004, 23, 673–678. [Google Scholar] [CrossRef]
  17. Petschnigg, G.; Agrawala, M.; Hoppe, H.; Szeliski, R.; Cohen, M.; Toyama, K. Digital Photography with Flash and No-flash Image Pairs. ACM Trans. Graph. 2004, 23, 664–672. [Google Scholar] [CrossRef]
  18. Kopf, J.; Cohen, M.; Lischinski, D.; Uyttendaele, M. Joint Bilateral Upsampling. ACM Trans. Graph. 2007, 26, 6497. [Google Scholar] [CrossRef]
  19. Wada, N.; Kazui, M.; Haseyama, M. Extended Joint Bilateral Filter for the Reduction of Color Bleeding in Compressed Image and Video. ITE Trans. Media Technol. Appl. 2015, 3, 95–106. [Google Scholar] [CrossRef]
  20. He, K.; Sun, J.; Tang, X. Guided Image Filtering. In Proceedings of the European Conference on Computer Vision (ECCV), Crete, Greece, 5–11 September 2010; pp. 1–14. [Google Scholar] [CrossRef]
  21. He, K.; Sun, J.; Tang, X. Single Image Haze Removal using Dark Channel Prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 2341–2353. [Google Scholar]
  22. Shi, Z.; Li, Y.; Zhang, C.; Zhao, M.; Feng, Y.; Jiang, B. Weighted median guided filtering method for single image rain removal. EURASIP J. Image Video Process. 2018, 2018, 35. [Google Scholar] [CrossRef]
  23. Eichhardt, I.; Chetverikov, D.; Janko, Z. Image-guided ToF depth upsampling: A survey. Mach. Vis. Appl. 2017, 28, 267–282. [Google Scholar] [CrossRef]
  24. Matsuo, T.; Fukushima, N.; Ishibashi, Y. Weighted Joint Bilateral Filter with Slope Depth Compensation Filter for Depth Map Refinement. In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP), Barcelona, Spain, 21–24 February 2013; pp. 300–309. [Google Scholar]
  25. Scharstein, D.; Szeliski, R. A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms. Int. J. Comput. Vis. 2002, 47, 7–42. [Google Scholar] [CrossRef]
  26. Hosni, A.; Rhemann, C.; Bleyer, M.; Rother, C.; Gelautz, M. Fast Cost-Volume Filtering for Visual Correspondence and Beyond. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 504–511. [Google Scholar] [CrossRef] [PubMed]
  27. Baker, S.; Scharstein, D.; Lewis, J.P.; Roth, S.; Black, M.J.; Szeliski, R. A Database and Evaluation Methodology for Optical Flow. Int. J. Comput. Vis. 2011, 92, 1–31. [Google Scholar] [CrossRef]
  28. Tomasi, C.; Manduchi, R. Bilateral Filtering for Gray and Color Images. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Bombay, India, 4–7 January 1998; pp. 839–846. [Google Scholar] [CrossRef]
  29. Lai, P.; Tian, D.; Lopez, P. Depth Map Processing with Iterative Joint Multilateral Filtering. In Proceedings of the Picture Coding Symposium (PCS), Nagoya, Japan, 8–10 December 2010; pp. 9–12. [Google Scholar] [CrossRef]
  30. Gastal, E.S.L.; Oliveira, M.M. Adaptive Manifolds for Real-Time High-Dimensional Filtering. ACM Trans. Graph. 2012, 31, 2185529. [Google Scholar] [CrossRef]
  31. Butt, I.T.; Rajpoot, N.M. Multilateral Filtering: A Novel Framework for Generic Similarity-Based Image Denoising. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Cairo, Egypt, 7–10 November 2009; pp. 2981–2984. [Google Scholar] [CrossRef]
  32. Choudhury, P.; Tumblin, J. The Trilateral Filter for High Contrast Images and Meshes. In Proceedings of the Eurographics Workshop on Rendering, Leuven, Belgium, 25–27 June 2003; pp. 186–196. [Google Scholar]
  33. Gastal, E.S.L.; Oliveira, M.M. Domain Transform for Edge-Aware Image and Video Processing. ACM Trans. Graph. 2011, 30, 1–12. [Google Scholar] [CrossRef]
  34. Sugimoto, K.; Fukushima, N.; Kamata, S. Fast bilateral filter for multichannel images via soft-assignment coding. In Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Jeju, Republic of Korea, 13–15 December 2016; pp. 1–4. [Google Scholar] [CrossRef]
  35. Nair, P.; Chaudhury, K.N. Fast High-Dimensional Kernel Filtering. IEEE Signal Process. Lett. 2019, 26, 377–381. [Google Scholar] [CrossRef]
  36. Miyamura, T.; Fukushima, N.; Waqas, M.; Sugimoto, K.; Kamata, S. Image Tiling for Clustering to Improve Stability of Constant-time Color Bilateral Filtering. In Proceedings of the International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 25–28 October 2020; pp. 1038–1042. [Google Scholar] [CrossRef]
  37. Oishi, S.; Fukushima, N. Tiling and PCA strategy for Clustering-based High-Dimensional Gaussian Filtering. SN Comput. Sci. 2023, 5, 40. [Google Scholar] [CrossRef]
  38. Lin, G.S.; Chen, C.Y.; Kuo, C.T.; Lie, W.N. A Computing Framework of Adaptive Support-Window Multi-Lateral Filter for Image and Depth Processing. IEEE Trans. Broadcast. 2014, 60, 452–463. [Google Scholar] [CrossRef]
  39. Yang, Y.; Liu, Q.; He, X.; Liu, Z. Cross-View Multi-Lateral Filter for Compressed Multi-View Depth Video. IEEE Trans. Image Process. 2019, 28, 302–315. [Google Scholar] [CrossRef]
  40. Durand, F.; Dorsey, J. Fast Bilateral Filtering for the Display of High-Dynamic-Range Images. ACM Trans. Graph. 2002, 21, 257–266. [Google Scholar] [CrossRef]
  41. Paris, S.; Durand, F. A Fast Approximation of the Bilateral Filter using A Signal Processing Approach. Int. J. Comput. Vis. 2009, 81, 24–52. [Google Scholar] [CrossRef]
  42. Yang, Q.; Ahuja, N.; Tan, K.H. Constant Time Median and Bilateral Filtering. Int. J. Comput. Vis. 2014, 112, 307–318. [Google Scholar] [CrossRef]
  43. Yang, Q.; Tan, K.H.; Ahuja, N. Real-Time O(1) Bilateral Filtering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 557–564. [Google Scholar] [CrossRef]
  44. Ghosh, S.; Chaudhury, K.N. On Fast Bilateral Filtering Using Fourier Kernels. IEEE Signal Process. Lett. 2016, 23, 570–573. [Google Scholar] [CrossRef]
  45. Sugimoto, K.; Fukushima, N.; Kamata, S. 200 FPS Constant-time Bilateral Filter Using SVD and Tiling Strategy. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 190–194. [Google Scholar] [CrossRef]
  46. Sumiya, Y.; Fukushima, N.; Sugimoto, K.; Kamata, S. Extending Compressive Bilateral Filtering for Arbitrary Range Kernel. In Proceedings of the International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 25–28 October 2020; pp. 1018–1022. [Google Scholar] [CrossRef]
  47. Adams, A.; Gelfand, N.; Dolson, J.; Levoy, M. Gaussian KD-Trees for Fast High-Dimensional Filtering. ACM Trans. Graph. 2009, 28, 1531327. [Google Scholar] [CrossRef]
  48. Adams, A.; Baek, J.; Davis, M.A. Fast High-Dimensional Filtering Using the Permutohedral Lattice. Comput. Graph. Forum 2010, 29, 753–762. [Google Scholar] [CrossRef]
  49. Maeda, Y.; Fukushima, N.; Matsuo, H. Taxonomy of Vectorization Patterns of Programming for FIR Image Filters Using Kernel Subsampling and New One. Appl. Sci. 2018, 8, 1985. [Google Scholar] [CrossRef]
  50. Maeda, Y.; Fukushima, N.; Matsuo, H. Effective Implementation of Edge-Preserving Filtering on CPU Microarchitectures. Appl. Sci. 2018, 8, 1235. [Google Scholar] [CrossRef]
  51. Naganawa, Y.; Kamei, H.; Kanetaka, Y.; Nogami, H.; Maeda, Y.; Fukushima, N. SIMD-Constrained Lookup Table for Accelerating Variable-Weighted Convolution on x86/64 CPUs. IEEE Access 2024. [Google Scholar] [CrossRef]
  52. Zhang, Q.; Shen, X.; Xu, L.; Jia, J. Rolling Guidance Filter. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014. [Google Scholar] [CrossRef]
  53. Pajares, G.; De La Cruz, J.M. A wavelet-based image fusion tutorial. Pattern Recognit. 2004, 37, 1855–1872. [Google Scholar] [CrossRef]
  54. Wang, Z.; Ziou, D.; Armenakis, C.; Li, D.; Li, Q. A comparative analysis of image fusion methods. IEEE Trans. Geosci. Remote Sens. 2005, 43, 1391–1402. [Google Scholar] [CrossRef]
  55. James, A.P.; Dasarathy, B.V. Medical image fusion: A survey of the state of the art. Inf. Fusion 2014, 19, 4–19. [Google Scholar] [CrossRef]
  56. Li, S.; Kang, X.; Fang, L.; Hu, J.; Yin, H. Pixel-level image fusion: A survey of the state of the art. Inf. Fusion 2017, 33, 100–112. [Google Scholar] [CrossRef]
57. Zhang, H.; Xu, H.; Tian, X.; Jiang, J.; Ma, J. Image fusion meets deep learning: A survey and perspective. Inf. Fusion 2021, 76, 323–336.
58. Kaur, H.; Koundal, D.; Kadyan, V. Image fusion techniques: A survey. Arch. Comput. Methods Eng. 2021, 28, 4425–4447.
59. Hermessi, H.; Mourali, O.; Zagrouba, E. Multimodal medical image fusion review: Theoretical background and recent advances. Signal Process. 2021, 183, 108036.
60. Azam, M.A.; Khan, K.B.; Salahuddin, S.; Rehman, E.; Khan, S.A.; Khan, M.A.; Kadry, S.; Gandomi, A.H. A review on multimodal medical image fusion: Compendious analysis of medical modalities, multimodal databases, fusion techniques and quality metrics. Comput. Biol. Med. 2022, 144, 105253.
61. Kalamkar, S. Multimodal image fusion: A systematic review. Decis. Anal. J. 2023, 9, 100327.
62. Singh, S.; Singh, H.; Bueno, G.; Deniz, O.; Singh, S.; Monga, H.; Hrisheekesha, P.; Pedraza, A. A review of image fusion: Methods, applications and performance metrics. Digit. Signal Process. 2023, 137, 104020.
63. Crow, F.C. Summed-Area Tables for Texture Mapping. In Proceedings of the ACM SIGGRAPH, Minneapolis, MN, USA, 23–27 July 1984; pp. 207–212.
64. Fukushima, N.; Maeda, Y.; Kawasaki, Y.; Nakamura, M.; Tsumura, T.; Sugimoto, K.; Kamata, S. Efficient Computational Scheduling of Box and Gaussian FIR Filtering for CPU Microarchitecture. In Proceedings of the 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Honolulu, HI, USA, 12–15 November 2018.
65. Deriche, R. Recursively Implementating the Gaussian and its Derivatives. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Singapore, 7–11 September 1992; pp. 263–267.
66. Sugimoto, K.; Kamata, S. Fast Gaussian Filter with Second-Order Shift Property of DCT-5. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Melbourne, Australia, 15–18 September 2013; pp. 514–518.
67. Takagi, H.; Fukushima, N. An Efficient Description with Halide for IIR Gaussian Filter. In Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Auckland, New Zealand, 7–10 December 2020.
68. Otsuka, T.; Fukushima, N.; Maeda, Y.; Sugimoto, K.; Kamata, S. Optimization of Sliding-DCT based Gaussian Filtering for Hardware Accelerator. In Proceedings of the International Conference on Visual Communications and Image Processing (VCIP), Macau, China, 1–4 December 2020; pp. 423–426.
69. Sayed, A.H. Fundamentals of Adaptive Filtering; John Wiley & Sons: Hoboken, NJ, USA, 2003.
70. Yoshizawa, S.; Belyaev, A.; Yokota, H. Fast Gauss Bilateral Filtering. Comput. Graph. Forum 2010, 29, 60–74.
71. Fujita, S.; Fukushima, N. Extending Guided Image Filtering for High-Dimensional Signals. In Communications in Computer and Information Science Book Series, Revised Selected Papers in 11th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016), Rome, Italy, 27–29 February 2016; Springer International Publishing: Berlin/Heidelberg, Germany, 2017; Volume 693, pp. 439–453.
72. Fattal, R.; Carroll, R.; Agrawala, M. Edge-Based Image Coarsening. ACM Trans. Graph. 2009, 29, 1640449.
73. Fattal, R. Edge-Avoiding Wavelets and Their Applications. ACM Trans. Graph. 2009, 28, 1531328.
74. Fukushima, N.; Kawasaki, Y.; Maeda, Y. Accelerating Redundant DCT Filtering for Deblurring and Denoising. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 4175–4179.
75. Gavaskar, R.G.; Chaudhury, K.N. Fast Adaptive Bilateral Filtering. IEEE Trans. Image Process. 2019, 28, 779–790.
76. Aubry, M.; Paris, S.; Hasinoff, S.W.; Kautz, J.; Durand, F. Fast Local Laplacian Filters: Theory and Applications. ACM Trans. Graph. 2014, 33, 2629645.
77. Sumiya, Y.; Otsuka, T.; Maeda, Y.; Fukushima, N. Gaussian Fourier Pyramid for Local Laplacian Filter. IEEE Signal Process. Lett. 2022, 29, 11–15.
78. Hayashi, K.; Maeda, Y.; Fukushima, N. Local Contrast Enhancement with Multiscale Filtering. In Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Taipei, Taiwan, 31 October–3 November 2023.
79. Mishiba, K. Fast Guided Median Filter. IEEE Trans. Image Process. 2023, 32, 737–749.
80. Tsubokawa, T.; Tajima, H.; Maeda, Y.; Fukushima, N. Local look-up table upsampling for accelerating image processing. Multimed. Tools Appl. 2023.
81. Farbman, Z.; Fattal, R.; Lischinski, D.; Szeliski, R. Edge-Preserving Decompositions for Multi-Scale Tone and Detail Manipulation. ACM Trans. Graph. 2008, 27, 1360666.
82. Min, D.; Choi, S.; Lu, J.; Ham, B.; Sohn, K.; Do, M. Fast Global Image Smoothing Based on Weighted Least Squares. IEEE Trans. Image Process. 2014, 23, 5638–5653.
83. Xu, L.; Lu, C.; Xu, Y.; Jia, J. Image Smoothing via L0 Gradient Minimization. ACM Trans. Graph. 2011, 30, 2024208.
84. Kanetaka, Y.; Takagi, H.; Maeda, Y.; Fukushima, N. SlidingConv: Domain-Specific Description of Sliding Discrete Cosine Transform Convolution for Halide. IEEE Access 2024, 12, 7563–7583.
85. Huynh-Thu, Q.; Ghanbari, M. Scope of validity of PSNR in image/video quality assessment. Electron. Lett. 2008, 44, 800–801.
86. Hore, A.; Ziou, D. Image quality metrics: PSNR vs. SSIM. In Proceedings of the International Conference on Pattern Recognition (ICPR), Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369.
87. Honda, S.; Maeda, Y.; Fukushima, N. Dataset of Subjective Assessment for Visually Near-Lossless Image Coding based on Just Noticeable Difference. In Proceedings of the International Conference on Quality of Multimedia Experience (QoMEX), Ghent, Belgium, 20–22 June 2023; pp. 236–239.
88. Yan, Q.; Shen, X.; Xu, L.; Zhuo, S.; Zhang, X.; Shen, L.; Jia, J. Cross-Field Joint Image Restoration via Scale Map. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, NSW, Australia, 1–8 December 2013.
89. Ishikawa, K.; Oishi, S.; Fukushima, N. Principal Component Analysis for Accelerating Color Bilateral Filtering. In Proceedings of the International Workshop on Advanced Imaging Technology (IWAIT), Jeju, Republic of Korea, 9–11 January 2023; Volume 12592, p. 125921F.
90. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
91. Otsuka, T.; Fukushima, N. Vectorized Implementation of K-means. In Proceedings of the International Workshop on Advanced Image Technology (IWAIT), Kagoshima, Japan, 5–6 January 2021.
Figure 1. Algorithm overview. n-lateral filtering denotes multilateral filtering that multiplies a spatial filter and n − 1 range filters. Examples of multiple guidance information are flash images, segmentation masks, and depth maps. The key point of the proposed algorithm is that it decomposes multilateral filtering into a set of constant-time filters. For implementation details, our code is available at https://fukushimalab.github.io/dmf/ (accessed on 16 January 2024).
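To make the n-lateral kernel concrete, the sketch below is a naive brute-force reference in Python/NumPy; it is not the accelerated DMF and not the C++ implementation linked above. Each neighbor weight is a spatial Gaussian multiplied by one range Gaussian per guidance image. The function name and arguments are illustrative assumptions.

```python
import numpy as np

def multilateral_filter_bruteforce(src, guides, sigma_s, sigma_rs, radius):
    """Naive n-lateral filter: spatial Gaussian times one range Gaussian per guide.
    Illustrative sketch only; the per-pixel cost grows with the kernel radius."""
    src = np.asarray(src, dtype=np.float64)
    guides = [np.asarray(g, dtype=np.float64) for g in guides]
    H, W = src.shape
    src_p = np.pad(src, radius, mode='reflect')
    guides_p = [np.pad(g, radius, mode='reflect') for g in guides]
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma_s ** 2))  # fixed spatial kernel
    out = np.empty((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            patch = src_p[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            w = spatial.copy()
            for g, g_p, s_r in zip(guides, guides_p, sigma_rs):
                g_patch = g_p[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
                w *= np.exp(-(g_patch - g[y, x]) ** 2 / (2.0 * s_r ** 2))  # one range kernel per guide
            out[y, x] = np.sum(w * patch) / np.sum(w)
    return out
```

For instance, multilateral_filter_bruteforce(noisy, [flash, depth], sigma_s=8.0, sigma_rs=[64.0, 16.0], radius=24) corresponds to a trilateral kernel with two guidance images; its per-pixel cost grows with the radius, which is the cost that a decomposition into constant-time filters removes.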
Figure 2. Difference of filtering weights between HDF and MF. HDF weights are computed using the method in [30]. The red point represents the target pixel for which the kernel weights are computed.
Figure 3. Procedure of splatting in DMF.
Figure 4. Multilateral rolling guidance filtering. Note that the constraining information J_2 is identical to the filtering image I in the first filtering iteration.
Figure 5. PSNR accuracy with respect to the number of coefficient images. The parameters are σ_{r1} = 32, σ_{r2} = 32, and σ_s = 8. SS denotes the spatial subsampling rate of ×1/16. Four input images were tested.
Figure 6. PSNR accuracy with respect to smoothing parameters. (a) σ_{r1} = 32 and σ_{r2} = 32. (b) σ_s = 8 and σ_{r2} = 32. (c) σ_s = 8 and σ_{r1} = 32. SS denotes the spatial subsampling rate of ×1/16. The input images are the same as in Figure 5.
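For reference, the PSNR reported in Figures 5 and 6 (and in the tables below) is the standard measure for 8-bit images; a minimal sketch of its computation is shown here, and the authors' exact evaluation code is not reproduced.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images (8-bit range assumed)."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak * peak / mse)
```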
Figure 7. Processing time. The input image size is 1 megapixel (1024 × 1024). The parameters are T_{3rd} = 8 and T_{2nd} = 8. SS denotes the spatial subsampling rate (×1/16).
Figure 8. Flash/no-flash denoising without the false edge in the flash image. The parameters are σ_{r1} = 64 (for joint bilateral filtering and DTF), σ_{r2} = 16, σ_s = 8, and T_{3rd} = 16.
Figure 9. Depth map refining. (b) is coded by JPEG (quality factor = 50). The parameters are σ_{r1} = 16, σ_{r2} = 16, σ_s = 2, T_{3rd} = 16, and T_{2nd} = 16. The ratios of bad pixels [25] (error threshold of 1.0) in (b–d) are 12.47, 9.24, and 5.38, respectively.
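The ratio of bad pixels in Figure 9 follows the common depth/stereo evaluation of [25]: the percentage of pixels whose absolute error exceeds a threshold (1.0 here). A minimal sketch under that assumption:

```python
import numpy as np

def bad_pixel_ratio(depth_est, depth_gt, threshold=1.0):
    """Percentage of pixels whose |estimate - ground truth| exceeds the threshold."""
    err = np.abs(np.asarray(depth_est, np.float64) - np.asarray(depth_gt, np.float64))
    return 100.0 * np.mean(err > threshold)
```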
Figure 10. Guided feathering. MRGF's results are shown in Figure 4. The parameters are r = 20, ϵ = 10^{-6}, σ_r = 160, and T_{3rd} = 4. Red boxes indicate magnified areas.
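For context on the parameters r and ϵ in Figure 10, guided feathering typically filters a rough binary mask with the color image as guidance. The snippet below is a minimal sketch using OpenCV's plain guided filter (requires opencv-contrib-python), not the proposed MRGF; the file names are hypothetical, and the images are assumed to be normalized to [0, 1].

```python
import cv2
import numpy as np

# Color guidance image and a rough binary mask (hypothetical file names).
guide = cv2.imread('input.png').astype(np.float32) / 255.0
mask = cv2.imread('rough_mask.png', cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0

# Plain guided image filtering with radius r = 20 and regularization eps = 1e-6, as in Figure 10.
alpha = cv2.ximgproc.guidedFilter(guide, mask, 20, 1e-6)

cv2.imwrite('feathered_alpha.png', np.clip(alpha * 255.0, 0.0, 255.0).astype(np.uint8))
```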
Figure 11. Haze removal. (f–h) show the details of (a–c) in the red boxes, respectively. Our result is computed by the MRGF in Figure 4. The parameters are r = 60, ϵ = 10^{-6}, σ_r = 60, and T_{3rd} = 16.
Table 1. PSNR accuracy of denoising (dB). The Gaussian noise parameter is σ = 10. * Filters that already use the characteristics of the guide image. Bold numbers indicate the best results.
Image | DCT | DMF-DCT | DTF | DMF-DTF | GIF | DMF-GIF | CFJIR * | HDKF *
0 | 36.73 | 36.73 | 34.41 | 35.11 | 34.35 | 35.26 | 35.22 | 35.55
1 | 33.23 | 33.41 | 31.37 | 31.97 | 32.34 | 33.14 | 32.13 | 33.53
2 | 36.16 | 36.13 | 34.50 | 34.72 | 34.45 | 35.10 | 34.54 | 35.67
3 | 39.56 | 40.54 | 37.53 | 37.88 | 37.30 | 39.38 | 40.52 | 40.49
4 | 36.31 | 36.47 | 34.11 | 34.73 | 33.90 | 35.97 | 35.46 | 36.77
5 | 35.51 | 35.46 | 33.76 | 34.26 | 33.82 | 34.68 | 34.38 | 34.57
6 | 34.06 | 33.99 | 32.21 | 32.85 | 32.49 | 33.25 | 32.93 | 33.05
7 | 36.56 | 36.69 | 34.69 | 35.25 | 34.60 | 37.28 | 37.52 | 37.70
8 | 34.50 | 34.38 | 33.04 | 33.67 | 33.56 | 32.99 | 32.80 | 33.42
9 | 36.58 | 36.82 | 33.28 | 34.10 | 33.77 | 37.03 | 37.49 | 37.62
Average | 35.92 | 36.06 | 33.89 | 34.45 | 34.06 | 35.41 | 35.30 | 35.84
Table 2. PSNR accuracy of denoising (dB). The Gaussian noise parameter is σ = 20. * Filters that already use the characteristics of the guide image. Bold numbers indicate the best results.
Image | DCT | DMF-DCT | DTF | DMF-DTF | GIF | DMF-GIF | CFJIR * | HDKF *
0 | 32.81 | 32.89 | 29.90 | 31.42 | 29.98 | 31.91 | 32.26 | 32.09
1 | 29.50 | 29.92 | 27.76 | 28.82 | 28.28 | 29.75 | 29.50 | 30.31
2 | 31.64 | 32.28 | 29.47 | 30.34 | 29.66 | 31.44 | 30.86 | 31.60
3 | 36.24 | 37.34 | 34.18 | 34.58 | 33.45 | 36.04 | 38.59 | 36.54
4 | 32.45 | 32.85 | 29.18 | 31.05 | 29.09 | 32.31 | 32.42 | 33.13
5 | 31.67 | 31.85 | 29.56 | 30.49 | 29.63 | 31.02 | 31.05 | 31.07
6 | 30.03 | 30.22 | 27.50 | 29.07 | 27.77 | 29.90 | 29.84 | 29.77
7 | 32.92 | 33.27 | 31.30 | 32.24 | 31.03 | 34.03 | 34.98 | 34.31
8 | 30.77 | 30.99 | 28.56 | 29.97 | 29.07 | 30.08 | 29.99 | 30.32
9 | 32.61 | 33.21 | 29.15 | 30.74 | 29.32 | 33.51 | 34.47 | 34.02
Average | 32.06 | 32.48 | 29.66 | 30.87 | 29.73 | 32.00 | 32.40 | 32.32
Table 3. PSNR accuracy of denoising (dB). The Gaussian noise parameter is σ = 30. * Filters that already use the characteristics of the guide image. Bold numbers indicate the best results.
Image | DCT | DMF-DCT | DTF | DMF-DTF | GIF | DMF-GIF | CFJIR * | HDKF *
0 | 30.35 | 30.55 | 27.24 | 29.50 | 27.61 | 30.21 | 30.42 | 30.16
1 | 27.64 | 28.28 | 26.03 | 27.44 | 26.42 | 28.27 | 27.86 | 28.15
2 | 28.84 | 29.83 | 26.08 | 27.63 | 26.88 | 29.12 | 28.44 | 28.75
3 | 33.93 | 35.01 | 32.54 | 33.00 | 30.82 | 34.99 | 36.41 | 34.29
4 | 30.18 | 30.88 | 26.28 | 29.36 | 26.58 | 30.63 | 30.36 | 30.48
5 | 29.32 | 29.71 | 26.79 | 28.07 | 27.27 | 28.48 | 29.05 | 29.09
6 | 27.72 | 28.18 | 24.65 | 27.09 | 25.39 | 27.88 | 28.00 | 27.84
7 | 30.64 | 31.19 | 28.94 | 30.51 | 28.82 | 32.60 | 32.85 | 31.71
8 | 28.68 | 29.16 | 25.98 | 28.12 | 26.71 | 28.31 | 28.45 | 28.73
9 | 30.14 | 31.22 | 26.94 | 29.22 | 27.18 | 32.17 | 31.80 | 30.82
Average | 29.74 | 30.40 | 27.15 | 28.99 | 27.37 | 30.27 | 30.36 | 30.00
Table 4. PSNR accuracy of flash/no-flash denoising with PCA (dB); higher values indicate better results. The Gaussian noise parameter for the no-flash images is σ = 10. Bold numbers indicate the best results.
Image | Noise | 1 | 2 | 3 | 4 | 5 | 6
0 | 28.82 | 30.92 | 30.96 | 31.84 | 33.33 | 33.33 | 33.32
1 | 28.47 | 33.95 | 34.55 | 36.34 | 36.35 | 36.34 | 36.33
2 | 28.16 | 39.91 | 40.91 | 41.32 | 41.30 | 41.29 | 41.28
3 | 28.27 | 37.66 | 39.56 | 39.80 | 39.80 | 39.79 | 39.79
4 | 28.20 | 36.34 | 38.01 | 38.70 | 38.80 | 38.76 | 38.74
5 | 28.16 | 38.35 | 40.43 | 40.73 | 40.70 | 40.67 | 40.66
6 | 28.18 | 40.85 | 42.20 | 42.31 | 42.29 | 42.27 | 42.27
7 | 28.26 | 37.44 | 39.48 | 39.84 | 39.85 | 39.85 | 39.84
8 | 28.28 | 35.19 | 37.38 | 37.52 | 37.60 | 37.59 | 37.59
9 | 28.15 | 39.35 | 41.42 | 41.40 | 41.38 | 41.36 | 41.35
Average | 28.30 | 37.00 | 38.49 | 38.98 | 39.14 | 39.12 | 39.12
Table 5. SSIM accuracy of flash/no-flash denoising with PCA; higher values indicate better results. The Gaussian noise parameter for the no-flash images is σ = 10. Bold numbers indicate the best results.
Image | Noise | 1 | 2 | 3 | 4 | 5 | 6
0 | 0.932 | 0.965 | 0.965 | 0.967 | 0.975 | 0.974 | 0.974
1 | 0.863 | 0.970 | 0.974 | 0.984 | 0.984 | 0.983 | 0.983
2 | 0.723 | 0.983 | 0.984 | 0.987 | 0.987 | 0.987 | 0.987
3 | 0.751 | 0.973 | 0.983 | 0.983 | 0.983 | 0.983 | 0.983
4 | 0.769 | 0.977 | 0.985 | 0.985 | 0.985 | 0.984 | 0.984
5 | 0.693 | 0.975 | 0.985 | 0.985 | 0.985 | 0.984 | 0.984
6 | 0.650 | 0.974 | 0.984 | 0.983 | 0.983 | 0.982 | 0.982
7 | 0.762 | 0.976 | 0.984 | 0.985 | 0.985 | 0.985 | 0.985
8 | 0.796 | 0.961 | 0.981 | 0.981 | 0.980 | 0.980 | 0.980
9 | 0.676 | 0.973 | 0.984 | 0.984 | 0.984 | 0.983 | 0.983
Average | 0.762 | 0.973 | 0.981 | 0.982 | 0.983 | 0.983 | 0.983
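The SSIM values in Table 5 follow the structural similarity index of [90]. A minimal way to compute it with scikit-image is sketched below with dummy data; the default window settings may differ slightly from the authors' evaluation, and the channel_axis argument requires a recent scikit-image (older releases use multichannel=True).

```python
import numpy as np
from skimage.metrics import structural_similarity

# Dummy 8-bit color images standing in for a reference image and a filtered result.
rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
filtered = np.clip(reference.astype(np.int16) + rng.integers(-10, 11, reference.shape),
                   0, 255).astype(np.uint8)

ssim_value = structural_similarity(reference, filtered, data_range=255, channel_axis=-1)
print(f"SSIM = {ssim_value:.3f}")
```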
Table 6. Processing time of filtering with multi-channel guide images.
Channels | Time (ms)
1 | 29.7
2 | 79.1
3 | 455.9
4 | 3380.4
5 | 27,224.7
6 | 224,146.0