Article

Haze-Aware Attention Network for Single-Image Dehazing

Lihan Tong, Yun Liu, Weijia Li, Liyuan Chen and Erkang Chen

1 School of Ocean Information Engineering, Jimei University, Xiamen 361021, China
2 College of Artificial Intelligence, Southwest University, Chongqing 400715, China
3 School of Computer Science, Jimei University, Xiamen 361021, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(13), 5391; https://doi.org/10.3390/app14135391
Submission received: 7 May 2024 / Revised: 1 June 2024 / Accepted: 2 June 2024 / Published: 21 June 2024
(This article belongs to the Special Issue Advances in Image Enhancement and Restoration Technology)

Abstract

Single-image dehazing is a pivotal challenge in computer vision that seeks to remove haze from images and restore clean background details. Recognizing the limitations of traditional physical model-based methods and the inefficiencies of current attention-based solutions, we propose a new dehazing network that combines an innovative Haze-Aware Attention Module (HAAM) with a Multiscale Frequency Enhancement Module (MFEM). The HAAM is inspired by the atmospheric scattering model and integrates physical principles into high-dimensional features for targeted dehazing. It captures latent features during the image restoration process, yielding a significant boost in quantitative metrics, while the MFEM efficiently enhances high-frequency details without the complexity of wavelet or Fourier transforms. It employs multiscale receptive fields to extract and emphasize key frequency components with minimal parameter overhead. Integrated into a simple U-Net framework, our Haze-Aware Attention Network (HAA-Net) for single-image dehazing significantly outperforms existing attention-based and transformer models in efficiency and effectiveness. Tested across various public datasets, the HAA-Net sets new performance benchmarks. Our work not only advances the field of image dehazing but also offers insights into the design of attention mechanisms for broader applications in computer vision.

1. Introduction

Single-image dehazing [1,2,3] aims to eliminate haze from images and accurately restore the details of a clean background. The problem is a classic example of an ill-posed problem, since the solution is not unique. Despite this challenge, single-image dehazing remains a key area of research owing to its wide range of applications in computer vision, including outdoor surveillance [4], outdoor scene understanding [5,6], and object detection [7,8]. Removing the visual effects of fog is crucial for improving the accuracy and effectiveness of these tasks. The pursuit of effective single-image dehazing methods has therefore become a focal point of research. Over the past decade, this field has attracted considerable interest from researchers and engineers, leading to the development of various innovative technologies and algorithms. These efforts are driven by the potential benefits that dehazing can bring to many downstream applications, making it a dynamic and vibrant area of study within the broader field of computer vision.
In recent years, the swift advancement of deep learning has brought attention-based dehazing networks to the forefront. The key to their growing popularity lies in the attention module’s ability to selectively target different areas and channels. This adaptability is especially valuable for dehazing networks, given the uneven spatial distribution of haze degradation, and offers a tailored approach to restoring clean images. Attention-based dehazing methods [2] have achieved performance far beyond that of traditional physical model-based methods [1]. The PCFA module [9] uses a feature pyramid and channel attention to extract and focus on crucial image features for effective dehazing. Zhang et al. [10] introduced an RMAM with an attention block to help networks concentrate on key features during learning. Zhang et al. [11] designed a network that combines multilevel feature fusion with mixed convolution attention to progressively improve dehazing results. However, the rationale behind these attention-based approaches remains unclear, as they lack a solid link to the atmospheric scattering model [1,12]. Additionally, some of these techniques [2,13,14] employ intricate designs or self-attention mechanisms, leading to suboptimal efficiency. In this study, we take a fresh look at attention mechanism design in dehazing networks and introduce a new approach inspired by physical priors [15,16,17], named the Haze-Aware Attention Module (HAAM). This module applies the attention mechanism to mimic the parameters of the atmospheric scattering model, expressing these parameters through high-dimensional features. By employing these features, constrained by the physical model, we conduct the dehazing process within the feature space, achieving outstanding results in both performance and efficiency.
In addition to the HAAM, we also developed a new Multiscale Frequency Enhancement Module (MFEM) designed to boost high-frequency details without relying on wavelet [18] or Fourier transforms [19]. The MFEM extracts contextual features with receptive fields at four scales, adapting to degradations of different sizes, and emphasizes important frequency components through lightweight learnable kernel parameters along the channel dimension, effectively enhancing the dehazing effect. This approach sidesteps the extra computation typically required for the inverse transforms of wavelet- or Fourier-based methods, resulting in an efficient and stable enhancement of features. By combining the proposed HAAM and MFEM with a straightforward U-Net architecture, we created the Haze-Aware Attention Network (HAA-Net) for single-image dehazing. Our method has been tested on both synthetic and real-world datasets, and in terms of both metrics and visual quality it significantly outperforms traditional attention-based methods as well as transformer-based methods.
The contributions of this work are summarized as follows:
  • We developed an efficient attention mechanism, the HAAM, which is inspired by the atmospheric scattering model and incorporates physical principles into high-dimensional features.
  • We crafted a Multiscale Frequency Enhancement Module that tunes high-frequency features, effectively restoring the finer details of hazy images.
  • Our HAA-Net sets new performance benchmarks across several public datasets. Notably, it reaches a PSNR/SSIM of 41.21 dB/0.996 on the RESIDE-Indoor dataset, showcasing its exceptional dehazing performance.

2. Related Work

2.1. Prior-Based Image Dehazing

Research on single-image dehazing in computer vision and computer graphics has been widely explored. Traditional methods have relied on priors such as the dark channel prior (DCP) [12], color attenuation prior [20], and nonlocal prior [21] to estimate scattering light, atmospheric light, depth, and transmission map [12,15,20,21,22]. These methods are backed by strong principles and are interpretable. However, they may not perform well in real-world image dehazing scenarios because they only extract features based on the atmospheric scattering model at the image level, without accessing deep latent features. The Atmospheric Scattering Model (ASM) has been the cornerstone for many previous works in constructing image dehazing networks. These methods have explicitly incorporated the ASM to enhance the generalization capability of their models, thus thoroughly validating the effectiveness of the ASM. In contrast to these approaches, we introduced the ASM at the feature level, thus leveraging it to learn more latent features. Additionally, we allocated a substantial number of channels for the atmospheric light value A, thus aiming to better adapt to the complexities of real-world scenarios.
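To make the prior-based pipeline concrete, the snippet below sketches the classic dark channel prior computation and the coarse transmission estimate it yields. The patch size and ω value are the typical settings from He et al. [12]; the function names and overall structure are ours, not any released implementation.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(image, patch_size=15):
    """Dark channel prior: per-pixel minimum over RGB, then a local minimum filter."""
    min_rgb = image.min(axis=2)                      # (H, W) minimum across the color channels
    return minimum_filter(min_rgb, size=patch_size)  # minimum over each local patch

def estimate_transmission(hazy, atmospheric_light, omega=0.95, patch_size=15):
    """Coarse transmission estimate: t(x) = 1 - omega * dark_channel(I / A)."""
    normalized = hazy / atmospheric_light            # divide each channel by its estimated A
    return 1.0 - omega * dark_channel(normalized, patch_size)
```

Such estimates operate purely at the image level, which is precisely the limitation that motivates introducing the ASM at the feature level in our work.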

2.2. Deep Learning-Based Image Dehazing

Because prior-based dehazing methods cannot adapt well to all haze scenes, recent dehazing efforts have moved away from handcrafted priors. Some end-to-end networks directly estimate haze-free images [2,3,14,23,24,25,26,27]. AECR-Net, SFNet, and others use a U-shaped structure, which has proven well suited to haze removal. These methods have achieved promising results but perform poorly when dehazing real images. Recently, transformers [28,29,30,31,32] have been applied to image tasks due to their advantage in capturing long-range relationships. However, their computational complexity increases quadratically with resolution, making them unsuitable for pixel-to-pixel tasks such as dehazing, and they lack theoretical interpretability. Therefore, instead of using transformers, we developed our own more efficient attention mechanism based on physical priors.

2.3. Attention-Based Image Dehazing

Attention mechanisms play a crucial role in the field of dehazing, and many effective attention mechanisms have been proposed to enhance hazy images. FFA-Net [2] introduced attention mechanisms and achieved impressive results in metrics such as the PSNR and SSIM. MSAFF-Net [33] used a channel attention module and a multiscale spatial attention module to focus on areas with fog-related features. Chen et al. [34] proposed the Detail-Enhanced Attention Block (DEAB), which enhances feature learning by combining Detail-Enhanced Convolution and Content-Guided Attention, thereby further improving dehazing performance. Zhang et al. [35] proposed a Residual Nonlocal Attention Network that accounts for the uneven distribution of information in corrupted images, designing both local and nonlocal attention blocks to extract features for high-quality image restoration. Mou et al. [36] introduced COLA-Net for image restoration, which combines local and nonlocal attention mechanisms to restore areas with complex textures and highly repetitive details. However, these methods have high complexity and slow processing during the dehazing process, and they overlook physical characteristics. To tackle this, we propose the Haze-Aware Attention Module, which considers the physical model in the feature space of low-resolution images. By incorporating physical priors, we obtain effective features with fewer parameters, leading to higher PSNR and SSIM values.

2.4. Frequency-Based Image Dehazing

Owing to the convolution theorem, Fourier analysis is widely used to address various low-level vision problems, and numerous algorithms have been developed from a frequency-domain perspective. Some CNN-based frameworks [37,38,39] have been used to bridge the frequency gap between blurred and ground-truth image pairs. For instance, Chen et al. [40] proposed a hierarchical desnowing network based on the dual-tree complex wavelet transform to reduce snow noise in images. Yang et al. [41] developed a wavelet transform-based U-Net model to replace traditional upsampling and downsampling operations. Zou et al. [18] employed the wavelet transform to divide the input into four frequency sub-bands and processed each sub-band with separate convolutions to prevent interference between different frequency parts. Yu et al. [42] used a deep Fourier transform to handle global frequency data and reconstruct the phase spectrum under the guidance of the amplitude spectrum, which in turn aids the learning of local features within the spatial domain. Liu et al. [43] achieved impressive results by removing the haze effect from the low-frequency part, based on the prior that haze is typically concentrated in the low-frequency spectrum of a multiscale wavelet decomposition. However, these methods all incur the cost of wavelet or Fourier transforms, making the computation more expensive. We explore a more straightforward and efficient Multiscale Frequency Enhancement Module (MFEM), which enriches and emphasizes the frequencies extracted from four receptive field sizes using ultralightweight learnable parameters and weights the features along the channel dimension, achieving satisfactory results.

3. Method

3.1. Image Dehazing

As shown in Figure 1, our dehazing model employs a classic encoder–decoder architecture as its backbone. This framework performs a 4× downsampling operation, which greatly reduces memory usage during both training and inference, thus enhancing the model’s operational efficiency. It is worth noting that our model uses three different types of activation functions: ReLU, validated for image dehazing in the gUnet [44] study, effectively learns complex patterns; Tanh, with its output range of (−1, 1), constrains the model’s output to prevent extreme values, enhancing stability and output quality; and Sigmoid is used within the attention branches to produce bounded attention maps. Additionally, the model employs a dynamic fusion module to merge features from the downsampling and upsampling layers, a strategy that helps retain more information from the image and strengthens the model’s ability to capture details, resulting in a more compact and efficient dehazing model. Within this refined feature space, we further enhance feature extraction and optimization through a cascade of Haze-Aware Attention Blocks (HAABs). Each HAAB consists of two key components: a Haze-Aware Attention Module (HAAM) and a Multiscale Frequency Enhancement Module (MFEM). The HAAM, inspired by physical priors, guides the network to progressively extract clear, fog-free features, which are crucial for the dehazing effect. Meanwhile, the MFEM enriches the features through multiscale modulation, intelligently identifying and emphasizing features that contain important information and then fusing them through channelwise weighted fusion with learnable parameters, which further improves the model’s performance. With this structural design, our dehazing model can effectively handle a variety of complex haze images; it not only preserves the original details of the image but also significantly improves its clarity and quality.
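To make the overall layout concrete, the following PyTorch sketch shows one plausible wiring of the encoder–decoder backbone with cascaded HAABs. It is a simplified illustration under our own assumptions (channel widths, block count, simple additive skips standing in for SKFusion [45], and plain convolutions standing in for the HAAM and MFEM, which are sketched in the next subsections); it is not the authors' implementation.

```python
import torch
import torch.nn as nn

class HAAB(nn.Module):
    """Haze-Aware Attention Block: HAAM followed by MFEM (plain convs as stand-ins here)."""
    def __init__(self, channels):
        super().__init__()
        self.haam = nn.Conv2d(channels, channels, 3, padding=1)  # stand-in for the HAAM
        self.mfem = nn.Conv2d(channels, channels, 3, padding=1)  # stand-in for the MFEM
    def forward(self, x):
        return x + self.mfem(self.haam(x))         # residual refinement of features

class HAANetSketch(nn.Module):
    """Minimal U-Net-style backbone: 4x downsampling, cascaded HAABs, 4x upsampling."""
    def __init__(self, base=32, num_blocks=4):
        super().__init__()
        self.inc = nn.Conv2d(3, base, 3, padding=1)
        self.down1 = nn.Conv2d(base, base * 2, 3, stride=2, padding=1)      # 2x down
        self.down2 = nn.Conv2d(base * 2, base * 4, 3, stride=2, padding=1)  # 4x down
        self.blocks = nn.Sequential(*[HAAB(base * 4) for _ in range(num_blocks)])
        self.up1 = nn.ConvTranspose2d(base * 4, base * 2, 2, stride=2)
        self.up2 = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.outc = nn.Conv2d(base, 3, 3, padding=1)
        self.act = nn.Tanh()                       # bounds the output, as noted above

    def forward(self, x):
        e0 = self.inc(x)
        e1 = self.down1(e0)
        e2 = self.down2(e1)
        d2 = self.blocks(e2)
        d1 = self.up1(d2) + e1                     # additive skip (SKFusion in the paper)
        d0 = self.up2(d1) + e0
        # Global residual to the hazy input; the paper's exact output head is not specified here.
        return x + self.act(self.outc(d0))
```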

3.2. Haze-Aware Attention Module

We introduce a new Haze-Aware Attention Module (HAAM), which applies the physical model at the feature level to guide feature extraction, thereby drawing out many potentially important latent features. Empirical evidence demonstrates that this module not only provides good interpretability but also significantly improves performance metrics. The introduction of the physical model likely also makes our module more robust and adaptable when processing real images. As shown in Figure 2, especially for the restoration of sky areas, it significantly outperforms other state-of-the-art methods, making the enhanced images more visually appealing. Next, we delve into the principles of the module. Leveraging the atmospheric scattering model, the generation of a hazy image can be described as follows:
$$I(x) = J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr),$$
where $I$ symbolizes the hazy image, $J$ represents the ground-truth clean image, and $A$ denotes the atmospheric light, whose scattering leads to a reduction in image contrast and visibility. $t$ is the transmission map, which reflects the proportion of light that travels from a point in the scene to the camera without being scattered, and $x$ indicates the pixel location. The transmission map is expressed as $t(x) = e^{-\beta d(x)}$, wherein $\beta$ signifies the atmospheric scattering coefficient and $d$ signifies the depth of the scene. As the scene depth increases, the amount of light that reaches the camera decreases exponentially, because the light encounters more scattering as it passes through the atmosphere. To streamline the process for convolution operations, we rearrange the equation and reformulate it in a matrix representation as follows:
$$\mathbf{J} \odot \mathbf{T} = \mathbf{I} - \mathbf{A} \odot (\mathbf{1} - \mathbf{T}),$$
where $\mathbf{J}$, $\mathbf{T}$, $\mathbf{I}$, and $\mathbf{A}$ denote the matrix representations of $J$, $t$, $I$, and $A$, respectively, and $\odot$ denotes elementwise multiplication. Based on the equations presented above, we constructed the Haze-Aware Attention Module in an intuitive and effective manner. Assuming the atmospheric light to be uniform, we derive $\mathbf{A}$ from the global contextual information of the entire image captured through Global Average Pooling (GAP):
$$\mathbf{A} = \sigma\Bigl(\mathrm{Conv}_{(N/8,\,N)}\bigl(\mathrm{ReLU}\bigl(\mathrm{Conv}_{(N,\,N/8)}(\mathrm{GAP}(X))\bigr)\bigr)\Bigr).$$
Here, $X$ represents the input feature map, $\mathrm{GAP}(\cdot)$ signifies global average pooling, $\mathrm{Conv}_{(N,\,N/8)}(\cdot)$ refers to a convolution layer with $N$ input channels and $N/8$ output channels, $\sigma$ denotes the Sigmoid activation function, and $N$ is set to 64. In obtaining $\mathbf{A}$, the channel dimension is first reduced and then expanded, which improves computational efficiency.
Given that $\mathrm{GAP}(\cdot)$ captures global information while neglecting local details and textures, we employ a $3\times3$ convolution layer to extract features for $\mathbf{T}$. By introducing physical priors, this approach balances global and local features, thereby facilitating a more effective restoration of hazy images:
$$\mathbf{T} = \sigma\Bigl(\mathrm{Conv}_{(N/8,\,N)}\bigl(\mathrm{ReLU}\bigl(\mathrm{Conv}_{(N,\,N/8)}(\mathrm{Conv}_{(N,\,N)}(X))\bigr)\bigr)\Bigr).$$
Subsequently, we perform elementwise multiplication between $\mathbf{A}$ and $(\mathbf{1} - \mathbf{T})$, so that $\mathbf{J} \odot \mathbf{T}$ is obtained via $X - \mathbf{A} \odot (\mathbf{1} - \mathbf{T})$. Given that division might lead to training instability, we approximate $\mathbf{T}'$, representing $1/\mathbf{T}$, using the following formula:
$$\mathbf{T}' = \sigma\Bigl(\mathrm{Conv}_{(N/8,\,N)}\bigl(\mathrm{ReLU}\bigl(\mathrm{Conv}_{(N,\,N/8)}(\mathrm{Conv}_{(N,\,N)}(X))\bigr)\bigr)\Bigr).$$
Finally, $\mathbf{J}$ is acquired through $\mathbf{J} = \bigl(X - \mathbf{A} \odot (\mathbf{1} - \mathbf{T})\bigr) \odot \mathbf{T}'$.
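For illustration, a minimal PyTorch sketch of how these equations could be wired together at the feature level is given below. The channel count $N = 64$ and the squeeze-and-excite style branches follow the text; the use of 1×1 convolutions for the channel reduction/expansion and the exact layer ordering are our assumptions rather than the authors' released implementation.

```python
import torch
import torch.nn as nn

class HAAM(nn.Module):
    """Sketch of the Haze-Aware Attention Module: feature-level estimates of A, T, and
    T' (an approximation of 1/T), combined as J = (X - A*(1 - T)) * T'."""
    def __init__(self, n=64):
        super().__init__()
        r = n // 8
        # Atmospheric-light branch: GAP -> channel squeeze -> excite -> Sigmoid
        self.a_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(n, r, 1), nn.ReLU(inplace=True),
            nn.Conv2d(r, n, 1), nn.Sigmoid(),
        )
        # Transmission branch: a 3x3 conv keeps local texture before the squeeze/excite
        self.t_branch = nn.Sequential(
            nn.Conv2d(n, n, 3, padding=1),
            nn.Conv2d(n, r, 1), nn.ReLU(inplace=True),
            nn.Conv2d(r, n, 1), nn.Sigmoid(),
        )
        # T' branch: same structure with separate weights, approximating 1/T
        self.t_inv_branch = nn.Sequential(
            nn.Conv2d(n, n, 3, padding=1),
            nn.Conv2d(n, r, 1), nn.ReLU(inplace=True),
            nn.Conv2d(r, n, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        a = self.a_branch(x)          # (B, N, 1, 1), broadcast over spatial positions
        t = self.t_branch(x)          # (B, N, H, W)
        t_inv = self.t_inv_branch(x)  # (B, N, H, W)
        return (x - a * (1.0 - t)) * t_inv   # feature-level J = (I - A(1 - T)) * T'
```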
HAAM is an advanced attention mechanism that stands out significantly from traditional spatial and channel attention mechanisms. Its core advantage lies in its ability to integrate physical prior knowledge, thus allowing the model to learn discriminative clean features more effectively during the training process. These clean features can more accurately reflect the essence of the image, thus reducing the interference of noise and artifacts, which is crucial for high-definition image reconstruction. By integrating physical priors, HAAM not only enhances the model’s understanding and processing capabilities of image content but also strengthens its generalization ability. Moreover, the design of HAAM also takes into account the computational efficiency of the model, thus reducing computational costs while improving module performance and making it a widely applicable attention mechanism.

3.3. Multiscale Frequency Enhancement Module

Traditional image restoration methods often focus on enhancing the frequency characteristics of images to improve their clarity and detail. These methods use transformation techniques, such as wavelet and Fourier transforms, to decompose the frequency features of an image into several different frequency bands. The purpose of this is to isolate signals of different frequencies, reducing their mutual interference, so that each band can be processed independently. Applying different convolutional kernels to these bands can further extract and enhance information within specific frequency ranges. This approach allows high-frequency details and low-frequency contours to be optimized separately, achieving better restoration results. However, there are some limitations to these methods. Firstly, they may not accurately identify and select the frequency components that carry the most important information in the image. Secondly, the need to process multiple bands separately not only increases the complexity of the algorithm but also significantly raises the computational cost; especially during the inverse transformation, the operation must be performed separately for each band, which can be a bottleneck when computational resources are limited. Furthermore, since the size of the degradation blur is always variable, the field of view (receptive field) is crucial in the image restoration process. We propose an exceptionally concise and efficient Multiscale Frequency Enhancement Module that employs extremely lightweight learnable parameters to effectively decompose frequencies into distinct components, highlighting the parts that contain key information. As depicted in Figure 3, we fully considered the impact of the receptive field in our design by utilizing kernels of sizes $3\times3$, $5\times5$, and $7\times7$, along with a global kernel, to capture four low-frequency components with different receptive field sizes. By subtracting these low-frequency components from the original input, we generate the corresponding high-frequency components and enhance the frequency sub-bands carrying significant information through network parameters. Subsequently, we apply learnable channel weights to the different frequency sub-bands. This process not only allows individual processing of each frequency sub-band but also achieves fine-tuning of the features, further enhancing the quality of image restoration.
The MFEM primarily consists of two main parts: the decoupler and the modulator. The decoupler acquires various frequency sub-bands using multiscale filtering. The modulator then highlights the significant frequency sub-bands with learnable parameters and processes each sub-band individually through learnable parameters on the channel dimension.
For any input feature map $X \in \mathbb{R}^{C \times H \times W}$, we obtain the lowest-frequency spectrum through average pooling. Then, by subtracting the low-frequency part from $X$, we obtain the high-frequency part. To fully capture spectral information from different receptive fields, we process $X$ using kernels of sizes $3\times3$, $5\times5$, and $7\times7$, as well as a global kernel. The formulas are as follows:
$$\begin{aligned}
X_{g}^{l} &= \mathrm{GAP}(X), & X_{g}^{h} &= X - X_{g}^{l},\\
X_{3\times3}^{l} &= \mathrm{Conv}_{3\times3}(X), & X_{3\times3}^{h} &= X - X_{3\times3}^{l},\\
X_{5\times5}^{l} &= \mathrm{Conv}_{5\times5}(X), & X_{5\times5}^{h} &= X - X_{5\times5}^{l},\\
X_{7\times7}^{l} &= \mathrm{Conv}_{7\times7}(X), & X_{7\times7}^{h} &= X - X_{7\times7}^{l}.
\end{aligned}$$
Here, $X_{g}^{l}$ and $X_{g}^{h}$ represent the global low-frequency and high-frequency sub-bands, respectively, and $X_{3\times3}^{l}$, $X_{3\times3}^{h}$, $X_{5\times5}^{l}$, $X_{5\times5}^{h}$, $X_{7\times7}^{l}$, and $X_{7\times7}^{h}$ denote the low-frequency and high-frequency sub-bands for the different receptive field sizes. To emphasize the frequency sub-bands that carry important information, we apply learnable weight parameters to the obtained sub-bands. Taking the global receptive field as an example, the formula is as follows:
$$\tilde{X}_{g}^{l} = M_{g}^{l} \odot X_{g}^{l},$$
where $\tilde{X}_{g}^{l}$ represents the global low-frequency sub-band after emphasizing the important information and $M_{g}^{l}$ is the corresponding learnable weight. Finally, we modulate the weighted frequency sub-bands along the channel dimension using learnable parameters, and the final output of the MFEM is obtained by summing these terms together:
$$\mathrm{MFEM}(X) = W_{g}^{c}\bigl(\tilde{X}_{g}^{l} + \tilde{X}_{g}^{h}\bigr) + W_{3\times3}^{c}\bigl(\tilde{X}_{3\times3}^{l} + \tilde{X}_{3\times3}^{h}\bigr) + W_{5\times5}^{c}\bigl(\tilde{X}_{5\times5}^{l} + \tilde{X}_{5\times5}^{h}\bigr) + W_{7\times7}^{c}\bigl(\tilde{X}_{7\times7}^{l} + \tilde{X}_{7\times7}^{h}\bigr),$$
where $W_{k}^{c}$, $k \in \{g,\, 3\times3,\, 5\times5,\, 7\times7\}$, represents the channel attention weight maps associated with the filters of the various scales.
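As a concrete illustration, the PyTorch sketch below implements the decoupler/modulator idea described above under our own assumptions: average pooling serves as the multiscale low-pass filter (as in Figure 3), and both the per-sub-band modulation weights and the per-scale channel weights are plain learnable parameters. It is a simplified sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MFEM(nn.Module):
    """Sketch of the Multiscale Frequency Enhancement Module: low-pass each scale,
    take high = x - low, re-weight every sub-band with lightweight learnable
    parameters, then fuse with per-scale channel weights."""
    def __init__(self, channels):
        super().__init__()
        self.global_pool = nn.AdaptiveAvgPool2d(1)
        # Local low-pass filters at three receptive-field sizes (average pooling here;
        # the decoupler could equally use fixed or learned convolutions).
        self.lp = nn.ModuleList([
            nn.AvgPool2d(k, stride=1, padding=k // 2) for k in (3, 5, 7)
        ])
        # One learnable modulation weight per sub-band (4 scales x {low, high}), per channel.
        self.band_weights = nn.Parameter(torch.ones(8, channels, 1, 1))
        # Per-scale channel fusion weights W_k^c.
        self.scale_weights = nn.Parameter(torch.ones(4, channels, 1, 1))

    def forward(self, x):
        lows = [self.global_pool(x).expand_as(x)] + [lp(x) for lp in self.lp]
        out = 0.0
        for i, low in enumerate(lows):
            high = x - low                               # high-frequency residual at this scale
            low_w = self.band_weights[2 * i] * low       # emphasize informative sub-bands
            high_w = self.band_weights[2 * i + 1] * high
            out = out + self.scale_weights[i] * (low_w + high_w)
        return out
```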
Our MFEM excels at addressing uneven fog densities and irregular shapes in images, thus successfully achieving the goal of high-quality image reconstruction. This module, with its innovative multiscale processing approach, can accurately identify and handle various details and textures within the image, thus maintaining clarity and realism even under complex conditions. The core strength of MFEM lies in its fine control over different frequency components, thus allowing it to provide customized treatment for every detail in the image. By separately optimizing low-frequency and high-frequency information, the MFEM can significantly enhance the clarity of edges and textures while preserving the overall structure of the image. Moreover, the lightweight design of the module also means it has a significant advantage in computational efficiency, thereby enabling it to quickly process a large amount of image data without compromising performance.
For the loss function, we designate the dehazed image, the clear ground truth $J_{gt}$, and the hazy image $I$ as the anchor, positive sample, and negative sample, respectively: $\mathcal{L}_{CR} = \mathrm{CR}\bigl(\mathrm{HAA\text{-}Net}(I),\, J_{gt},\, I\bigr)$. Finally, we combine the Contrastive Regularization loss with the $\mathcal{L}_{1}$ loss function to form the final loss function:
$$\mathcal{L}_{total} = \lambda\, \mathcal{L}_{CR} + \mathcal{L}_{1}\bigl(\mathrm{HAA\text{-}Net}(I),\, J_{gt}\bigr).$$
Our experimental validation shows that excellent metrics are achieved when $\lambda = 0.5$.
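The sketch below shows one way the combined objective could be computed in PyTorch. The Contrastive Regularization term follows the general recipe of [3] (feature-space L1 distances to the positive divided by distances to the negative, using a frozen VGG-19); the specific VGG layers, the absence of per-layer weights, and the helper names are our assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class ContrastiveRegularization(nn.Module):
    """Pull the dehazed output toward the clear image (positive) and push it away from
    the hazy input (negative) in the feature space of a frozen VGG-19."""
    def __init__(self, layer_ids=(3, 8, 13), eps=1e-7):
        super().__init__()
        self.vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.eval()
        for p in self.vgg.parameters():
            p.requires_grad = False
        self.layer_ids = set(layer_ids)   # which VGG feature maps to compare (an assumption)
        self.eps = eps
        self.l1 = nn.L1Loss()

    def features(self, x):
        feats = []
        for i, layer in enumerate(self.vgg):
            x = layer(x)
            if i in self.layer_ids:
                feats.append(x)
        return feats

    def forward(self, anchor, positive, negative):
        fa, fp, fn = self.features(anchor), self.features(positive), self.features(negative)
        loss = 0.0
        for a, p, n in zip(fa, fp, fn):
            loss = loss + self.l1(a, p) / (self.l1(a, n) + self.eps)
        return loss

def total_loss(output, gt, hazy, cr_module, lam=0.5):
    """L_total = lambda * L_CR + L_1; lam is the CR weight discussed in Section 3.3."""
    return lam * cr_module(output, gt, hazy) + nn.functional.l1_loss(output, gt)
```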

4. Experiments

4.1. Implementation Details

We conducted all experiments with PyTorch 1.11.0 on four NVIDIA RTX 4090 GPUs (NVIDIA, Santa Clara, CA, USA). During training, the images were randomly cropped to 320 × 320 patches; when calculating model complexity, we set the input size to 128 × 128. We used the Adam optimizer with exponential decay rates of 0.9 for β1 and 0.999 for β2. The initial learning rate was set to 0.00015 and scheduled with a cosine annealing strategy, and the batch size was set to 64. Empirically, we set the penalty parameter λ to 0.2 and γ to 0.25, and we trained for 80k steps. We employed Contrastive Regularization (CR) [3] to better restore dehazed images.
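A minimal sketch of the optimizer and learning-rate schedule described above is given below; the model and data are placeholders (a single convolution and random tensors) so the snippet runs standalone, and the CR term is omitted for brevity.

```python
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR

model = nn.Conv2d(3, 3, 3, padding=1)               # placeholder standing in for HAA-Net
optimizer = Adam(model.parameters(), lr=1.5e-4, betas=(0.9, 0.999))
scheduler = CosineAnnealingLR(optimizer, T_max=80_000)  # cosine annealing over 80k steps

for step in range(3):                                # a few dummy steps for illustration
    hazy = torch.rand(2, 3, 320, 320)                # stands in for a batch of 320x320 crops
    clear = torch.rand(2, 3, 320, 320)
    loss = nn.functional.l1_loss(model(hazy), clear)  # CR term omitted in this sketch
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                                 # learning rate annealed per step
```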

4.2. Datasets and Metrics

We used the PSNR and SSIM to evaluate the performance of our HAA-Net. We trained and tested the network on six datasets: RESIDE-Indoor [47], Haze4K [48], RTTS [46], RESIDE-Outdoor [47], NH-HAZE [49], and Dense-Haze [50]. Specifically, the RESIDE-Indoor dataset has a total of 13,990 image pairs. We trained our model using 13,000 of those pairs and then tested the model on an additional 990 images from the RESIDE-Indoor set. We also conducted training and testing on the RESIDE-Outdoor dataset, which is larger and offers a more diverse set of data; this fully demonstrates the model’s generalization capabilities. The Haze4K dataset comprises 4000 image pairs, with 3000 used for training and the remaining 1000 for testing. Compared to RESIDE-Indoor, Haze4K includes both indoor and outdoor scenes, making it more realistic. The RTTS dataset consists of 1000 real haze images, which is ideal for assessing the generalization of our model trained on RESIDE-Indoor and Haze4K; it differs significantly from the other two datasets, providing a challenging and effective benchmark for evaluating the performance of HAA-Net. The NH-HAZE dataset is made up of 55 image pairs, with 50 pairs used for training and 5 pairs for testing; this setup thoroughly demonstrates our model’s ability to handle fog with uneven distribution and varying densities. The Dense-Haze dataset comprises 55 pairs, including hazy images of varying sizes and densities along with their corresponding GT images. We utilized 50 pairs for training and reserved 5 pairs for testing, thereby validating the robustness of our model.
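For reference, the snippet below shows how the two reported metrics can be computed for a single restored/ground-truth pair with scikit-image; the arrays are random stand-ins, and this is simply standard library usage rather than the authors' evaluation script.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(restored, ground_truth):
    """PSNR and SSIM for one HxWx3 image pair with values in [0, 1]."""
    psnr = peak_signal_noise_ratio(ground_truth, restored, data_range=1.0)
    # Older scikit-image versions use multichannel=True instead of channel_axis.
    ssim = structural_similarity(ground_truth, restored, data_range=1.0, channel_axis=-1)
    return psnr, ssim

# Random arrays standing in for a dehazed result and its GT image
gt = np.random.rand(256, 256, 3)
out = np.clip(gt + 0.01 * np.random.randn(256, 256, 3), 0.0, 1.0)
print(evaluate_pair(out, gt))
```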

4.3. Comparison with State-of-the-Art Methods

Results on Synthetic Datasets. We compared our approach with the state of the art on simulated haze images from the RESIDE-Indoor, Haze4K, and RESIDE-Outdoor datasets. For the RESIDE-Indoor dataset, we can observe visually that KDDN [51], MSBDN [25], and AOD-Net [1] suffered from loss of texture details and color distortion when dealing with small patches of haze (yellow box in Figure 4), and they also exhibited edge distortion issues (green box in Figure 4). While DehazeFormer-L [13], Dehamer [52], and FFA-Net [2] produced improved images, they sometimes overly brightened the images, leading to the darkening of certain details (yellow box in Figure 4), and showed slight edge distortion (green box in Figure 4). In contrast, our method excelled in preserving details, the clarity of textures, and color authenticity. For the RESIDE-Outdoor dataset, as shown in Figure 5, AOD-Net [1] and GridDehazeNet [14] left considerable haze in the images, FFA-Net [2] and KDDN [51] both left small haze residues, and DehazeFormer [13] made the images too dark after enhancement; our HAA-Net results look the closest to the clean images. When evaluating performance on the RESIDE-Indoor and Haze4k datasets, our HAA-Net outperformed all other state-of-the-art methods. On the RESIDE-Indoor test set, as shown in Table 1, the HAA-Net achieved the highest PSNR of 41.21 dB and SSIM of 0.996, surpassing the second-best method by 1.16 dB in PSNR while also reducing the parameter count by approximately 30%. On the Haze4k dataset, as shown in Table 1, the HAA-Net continued to demonstrate superior performance, achieving a PSNR of 33.93 dB and an SSIM of 0.99.
Real-World Visual Comparisons. We performed real-world haze tests using samples from the RTTS, NH-HAZE, and Dense-Haze datasets, which are more challenging than synthetic ones. The RTTS dataset includes dense and uneven haze, which thoroughly tests the robustness and effectiveness of the model. As shown in Figure 2, AOD-Net [1] left large fog remnants, and both GridDehazeNet [14] and FFA-Net [2] retained substantial fog, with overenhancement in the sky areas. MSBDN [25], KDDN [51], and Dehamer [52] all showed residual fog, and when the fog was heavy, their enhanced images turned out too dark. DehazeFormer-L [13] also showed residual fog and severe texture loss. Clearly, the images restored by our HAA-Net are clear in texture and realistic in color, closely matching the clean images, which fully demonstrates the superior robustness and effectiveness of our method. NH-HAZE is a nonhomogeneous real-image dehazing dataset. As shown in Figure 6, a side-by-side comparison makes clear that our HAA-Net adapts well to haze of different concentrations, and it achieved a PSNR of 21.32 dB and an SSIM of 0.692, which is significantly better than other state-of-the-art methods. The Dense-Haze dataset includes images with various extents and densities of haze, which poses a greater challenge for dehazing. Nonetheless, our HAA-Net surpassed the best-performing methods to date, obtaining a PSNR of 18.74 dB and an SSIM of 0.620. These achievements strongly validate that our HAA-Net can effectively handle haze of different magnitudes and concentrations.

4.4. Ablation Study

We performed an ablation study of our HAA-Net on the Haze4k dataset, gradually adding the key components of the model to demonstrate the effectiveness of each module, as shown in Table 2. We started with a “Base” model, a straightforward U-Net structure with basic 3 × 3 depthwise convolutions. When the HAAM was added to the base model, performance improved significantly, reaching a PSNR of 31.76 dB and an SSIM of 0.97; the 6.3 dB PSNR gain from adding the HAAM alone demonstrates its effectiveness for dehazing. Adding the MFEM alone to the base model raised the PSNR to 32.32 dB with an SSIM of 0.98. When we combined the MFEM with the HAAM, the model performed even better, achieving a PSNR of 33.46 dB and an SSIM of 0.99. Furthermore, by incorporating SKFusion [45], we elevated the PSNR to a new peak of 33.93 dB while maintaining an SSIM of 0.99. These outcomes not only validate the effectiveness of the modules we developed but also establish the HAA-Net as a standout performer in the field of image dehazing.

5. Limitations

While our method shows excellent performance, it is not without its limitations. Specifically, due to the high complexity of our HAA-Net model and the inclusion of attention mechanisms, the number of parameters is relatively high. This could lead to increased computational costs and pose challenges in situations where computational resources are constrained. Additionally, although our network’s complex design is advantageous for capturing fine-grained features in hazy images, it also results in a more complex model structure. This complexity may potentially affect the model’s interpretability and could require more training data to achieve optimal performance.
Unfortunately, despite delivering remarkable results, our model, with a parameter count of 18.7 million and a computational complexity of 122.48 GMacs, still requires further optimization to be deployable on embedded devices. The deployment on embedded systems will necessitate further trade-offs between the actual performance and operational speed.

6. Conclusions

In this paper, we have introduced the HAA-Net, a novel image dehazing framework that builds on a U-Net structure and incorporates HAABs. Each HAAB comprises two key components: the Haze-Aware Attention Module and the Multiscale Frequency Enhancement Module. The HAAM incorporates physical principles at the feature level, which helps the network capture more useful latent details during image restoration; the inclusion of the physical model likely also explains the strong results on real-world hazy images. The MFEM, in turn, extracts frequency features using multiscale receptive fields and highlights important information across channels, making it well suited to fog of varying sizes and densities. We evaluated our model on both synthetic and real-world datasets, and the thorough evaluations show that the HAA-Net is robust and effective across a range of dehazing tasks. Our method outperforms other state-of-the-art methods, demonstrating its potential as a leading solution in image processing and computer vision.

Author Contributions

Conceptualization, L.T. and E.C.; data curation, L.T.; formal analysis, Y.L.; methodology, L.T.; resources, L.C.; software, L.T. and W.L.; supervision, Y.L.; validation, Y.L., W.L. and L.C.; writing—original draft preparation, L.T., E.C. and Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Youth Science and Technology Innovation Program of Xiamen Ocean and Fisheries Development Special Funds (23ZHZB039QCB24), the Xiamen Ocean and Fisheries Development Special Funds (22CZB013HJ04), and the National Natural Science Foundation of China (Grant No. 62301453).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found here: https://sites.google.com/view/reside-dehaze-datasets (accessed on 22 April 2019), https://pan.baidu.com/share/init?surl=41MW0YAvjFcydlroQZZizA (accessed on 6 August 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, B.; Peng, X.; Wang, Z.; Xu, J.; Feng, D. Aod-net: All-in-one dehazing network. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4770–4778. [Google Scholar]
  2. Qin, X.; Wang, Z.; Bai, Y.; Xie, X.; Jia, H. FFA-Net: Feature fusion attention network for single image dehazing. Proc. AAAI Conf. Artif. Intell. 2020, 34, 11908–11915. [Google Scholar] [CrossRef]
  3. Wu, H.; Qu, Y.; Lin, S.; Zhou, J.; Qiao, R.; Zhang, Z.; Xie, Y.; Ma, L. Contrastive learning for compact single image dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 10551–10560. [Google Scholar]
  4. Ye, T.; Jiang, M.; Zhang, Y.; Chen, L.; Chen, E.; Chen, P.; Lu, Z. Perceiving and Modeling Density is All You Need for Image Dehazing. arXiv 2021, arXiv:2111.09733. [Google Scholar]
  5. Sakaridis, C.; Dai, D.; Hecker, S.; Van Gool, L. Model adaptation with synthetic and real data for semantic dense foggy scene understanding. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 687–704. [Google Scholar]
  6. Sakaridis, C.; Dai, D.; Van Gool, L. Semantic foggy scene understanding with synthetic data. Int. J. Comput. Vis. 2018, 126, 973–992. [Google Scholar] [CrossRef]
  7. Li, B.; Peng, X.; Wang, Z.; Xu, J.; Feng, D. End-to-end united video dehazing and detection. Proc. AAAI Conf. Artif. Intell. 2018, 32, 7016–7023. [Google Scholar] [CrossRef]
  8. Chen, Y.; Li, W.; Sakaridis, C.; Dai, D.; Van Gool, L. Domain adaptive faster r-cnn for object detection in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3339–3348. [Google Scholar]
  9. Zhang, X.; Wang, T.; Wang, J.; Tang, G.; Zhao, L. Pyramid channel-based feature attention network for image dehazing. Comput. Vis. Image Underst. 2020, 197, 103003. [Google Scholar] [CrossRef]
  10. Zhang, X.; Wang, T.; Luo, W.; Huang, P. Multi-level fusion and attention-guided CNN for image dehazing. IEEE Trans. Circuits Syst. Video Technol. 2020, 31, 4162–4173. [Google Scholar] [CrossRef]
  11. Zhang, X.; Wang, J.; Wang, T.; Jiang, R. Hierarchical Feature Fusion With Mixed Convolution Attention for Single Image Dehazing. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 510–522. [Google Scholar] [CrossRef]
  12. He, K.; Sun, J.; Tang, X. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 2341–2353. [Google Scholar] [PubMed]
  13. Song, Y.; He, Z.; Qian, H.; Du, X. Vision transformers for single image dehazing. IEEE Trans. Image Process. 2023, 32, 1927–1941. [Google Scholar] [CrossRef]
  14. Liu, X.; Ma, Y.; Shi, Z.; Chen, J. Griddehazenet: Attention-based multi-scale network for image dehazing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7314–7323. [Google Scholar]
  15. McCartney, E.J. Optics of the Atmosphere: Scattering by Molecules and Particles; IEEE: New York, NY, USA, 1976. [Google Scholar]
  16. Narasimhan, S.G.; Nayar, S.K. Chromatic framework for vision in bad weather. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition—CVPR 2000 (Cat. No. PR00662), Hilton Head, SC, USA, 15 June 2000; Volume 1, pp. 598–605. [Google Scholar]
  17. Narasimhan, S.G.; Nayar, S.K. Vision and the atmosphere. Int. J. Comput. Vis. 2002, 48, 233–254. [Google Scholar] [CrossRef]
  18. Zou, W.; Jiang, M.; Zhang, Y.; Chen, L.; Lu, Z.; Wu, Y. Sdwnet: A straight dilated network with wavelet transformation for image deblurring. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 1895–1904. [Google Scholar]
  19. Mao, X.; Liu, Y.; Shen, W.; Li, Q.; Wang, Y. Deep residual fourier transformation for single image deblurring. arXiv 2021, arXiv:2111.11745. [Google Scholar]
  20. Zhu, Q.; Mai, J.; Shao, L. Single image dehazing using color attenuation prior. In Proceedings of the BMVC, Nottingham, UK, 1–5 September 2014. [Google Scholar]
  21. Berman, D.; Treibitz, T.; Avidan, S. Non-local image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1674–1682. [Google Scholar]
  22. Tan, R.T. Visibility in bad weather from a single image. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar]
  23. Ren, W.; Ma, L.; Zhang, J.; Pan, J.; Cao, X.; Liu, W.; Yang, M.H. Gated fusion network for single image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3253–3261. [Google Scholar]
  24. Qu, Y.; Chen, Y.; Huang, J.; Xie, Y. Enhanced pix2pix dehazing network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 8160–8168. [Google Scholar]
  25. Dong, H.; Pan, J.; Xiang, L.; Hu, Z.; Zhang, X.; Wang, F.; Yang, M.H. Multi-scale boosted dehazing network with dense feature fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2157–2167. [Google Scholar]
  26. Liu, Y.; Yan, Z.; Tan, J.; Li, Y. Multi-Purpose Oriented Single Nighttime Image Haze Removal Based on Unified Variational Retinex Model. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 1643–1657. [Google Scholar] [CrossRef]
  27. Cui, Y.; Tao, Y.; Bing, Z.; Ren, W.; Gao, X.; Cao, X.; Huang, K.; Knoll, A. Selective frequency network for image restoration. In Proceedings of the Eleventh International Conference on Learning Representations, Virtual Event, 25–29 April 2022. [Google Scholar]
  28. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022. [Google Scholar]
  29. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5728–5739. [Google Scholar]
  30. Tsai, F.J.; Peng, Y.T.; Lin, Y.Y.; Tsai, C.C.; Lin, C.W. Stripformer: Strip transformer for fast image deblurring. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 146–162. [Google Scholar]
  31. Zhu, L.; Wang, X.; Ke, Z.; Zhang, W.; Lau, R.W. Biformer: Vision transformer with bi-level routing attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 10323–10333. [Google Scholar]
  32. Liu, Y.; Yan, Z.; Chen, S.; Ye, T.; Ren, W.; Chen, E. Nighthazeformer: Single nighttime haze removal using prior query transformer. In Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October–3 November 2023; pp. 4119–4128. [Google Scholar]
  33. Lin, C.; Rong, X.; Yu, X. Msaff-net: Multiscale attention feature fusion networks for single image dehazing and beyond. IEEE Trans. Multimed. 2022, 25, 3089–3100. [Google Scholar] [CrossRef]
  34. Chen, Z.; He, Z.; Lu, Z.M. DEA-Net: Single image dehazing based on detail-enhanced convolution and content-guided attention. IEEE Trans. Image Process. 2024, 33, 1002–1015. [Google Scholar] [CrossRef]
  35. Zhang, Y.; Li, K.; Li, K.; Zhong, B.; Fu, Y. Residual non-local attention networks for image restoration. arXiv 2019, arXiv:1903.10082. [Google Scholar]
  36. Mou, C.; Zhang, J.; Fan, X.; Liu, H.; Wang, R. COLA-Net: Collaborative attention network for image restoration. IEEE Trans. Multimed. 2021, 24, 1366–1377. [Google Scholar] [CrossRef]
  37. Selesnick, I.W.; Baraniuk, R.G.; Kingsbury, N.C. The dual-tree complex wavelet transform. IEEE Signal Process. Mag. 2005, 22, 123–151. [Google Scholar] [CrossRef]
  38. Yoo, J.; Lee, S.h.; Kwak, N. Image restoration by estimating frequency distribution of local patches. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6684–6692. [Google Scholar]
  39. Yang, H.H.; Fu, Y. Wavelet u-net and the chromatic adaptation transform for single image dehazing. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 2736–2740. [Google Scholar]
  40. Chen, W.T.; Fang, H.Y.; Hsieh, C.L.; Tsai, C.C.; Chen, I.; Ding, J.J.; Kuo, S.Y. All snow removed: Single image desnowing algorithm using hierarchical dual-tree complex wavelet representation and contradict channel loss. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 4196–4205. [Google Scholar]
  41. Yang, H.H.; Yang, C.H.H.; Tsai, Y.C.J. Y-net: Multi-scale feature aggregation network with wavelet structure similarity loss function for single image dehazing. In Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 2628–2632. [Google Scholar]
  42. Yu, H.; Zheng, N.; Zhou, M.; Huang, J.; Xiao, Z.; Zhao, F. Frequency and spatial dual guidance for image dehazing. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 181–198. [Google Scholar]
  43. Liu, X.; Zhang, H.; Cheung, Y.m.; You, X.; Tang, Y.Y. Efficient single image dehazing and denoising: An efficient multi-scale correlated wavelet approach. Comput. Vis. Image Underst. 2017, 162, 23–33. [Google Scholar] [CrossRef]
  44. Song, Y.; Zhou, Y.; Qian, H.; Du, X. Rethinking performance gains in image dehazing networks. arXiv 2022, arXiv:2209.11448. [Google Scholar]
  45. Li, X.; Wang, W.; Hu, X.; Yang, J. Selective kernel networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 510–519. [Google Scholar]
  46. Li, B.; Ren, W.; Fu, D.; Tao, D.; Feng, D.; Zeng, W.; Wang, Z. Benchmarking Single-Image Dehazing and Beyond. IEEE Trans. Image Process. 2019, 28, 492–505. [Google Scholar] [CrossRef]
  47. Zhong, R.Y.; Huang, G.Q.; Dai, Q.; Zhang, T. Mining SOTs and dispatching rules from RFID-enabled real-time shopfloor production data. J. Intell. Manuf. 2014, 25, 825–843. [Google Scholar] [CrossRef]
  48. Liu, Y.; Zhu, L.; Pei, S.; Fu, H.; Qin, J.; Zhang, Q.; Wan, L.; Feng, W. From Synthetic to Real: Image Dehazing Collaborating with Unlabeled Real Data. arXiv 2021, arXiv:2108.02934. [Google Scholar]
  49. Ancuti, C.O.; Ancuti, C.; Timofte, R. NH-HAZE: An Image Dehazing Benchmark with Non-Homogeneous Hazy and Haze-Free Images. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020; pp. 1798–1805. [Google Scholar] [CrossRef]
  50. Ancuti, C.O.; Ancuti, C.; Sbert, M.; Timofte, R. Dense-haze: A benchmark for image dehazing with dense-haze and haze-free images. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 1014–1018. [Google Scholar]
  51. Hong, M.; Xie, Y.; Li, C.; Qu, Y. Distilling Image Dehazing with Heterogeneous Task Imitation. In Proceedings of the CVPR, Seattle, WA, USA, 13–19 June 2020; pp. 3459–3471. [Google Scholar]
  52. Guo, C.L.; Yan, Q.; Anwar, S.; Cong, R.; Ren, W.; Li, C. Image dehazing transformer with transmission-aware 3d position embedding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5812–5820. [Google Scholar]
  53. Song, X.; Zhou, D.; Li, W.; Dai, Y.; Shen, Z.; Zhang, L.; Li, H. TUSR-Net: Triple Unfolding Single Image Dehazing with Self-Regularization and Dual Feature to Pixel Attention. IEEE Trans. Image Process. 2023, 32, 1231–1244. [Google Scholar] [CrossRef] [PubMed]
  54. Cui, Y.; Ren, W.; Knoll, A. Omni-Kernel Network for Image Restoration. Proc. AAAI Conf. Artif. Intell. 2024, 38, 1426–1434. [Google Scholar] [CrossRef]
  55. Yin, H.; Yang, P. Multi-Stage Progressive Single Image Dehazing Network with Feature Physics Model. IEEE Trans. Instrum. Meas. 2024, 73, 5013612. [Google Scholar] [CrossRef]
Figure 1. The overview of our Haze-Aware Attention Network architecture. We give details of the structure and configurations in Section 3. SKFusion [45] is a feature fusion method.
Figure 2. Visual results comparisons on real-world hazy images from the RTTS dataset [46]. Zoom in for best view.
Figure 3. Multiscale Frequency Enhancement Module. GAP stands for Global Average Pooling. AP k × k means an Average Pooling operation with a kernel size of k × k. Modulation is a process that recalibrates the channels by setting attention weights as directly learnable parameters, without adding any extra layers. Learnable parameters are adjustable values that help adjust the weights at different scales.
Figure 4. Visual results comparisons on RESIDE-Indoor [46] dataset. Zoom in for best view.
Figure 5. Visual results comparisons on synthetic hazy images from the RESIDE-Outdoor dataset [46]. Zoom in for best view.
Figure 6. Visual results comparisons on real-world hazy images from the NH-HAZE dataset [49]. Zoom in for best view.
Table 1. Quantitative comparisons with SOTA methods on the RESIDE-Indoor [46], RESIDE-Outdoor [46], Haze4K [48], NH-Haze [49], and Dense-Haze [50] datasets. Bold font indicates the optimal value for vertical comparison.
| Method | RESIDE-Indoor [46] PSNR (dB) / SSIM | RESIDE-Outdoor [46] PSNR (dB) / SSIM | Haze4k [48] PSNR (dB) / SSIM | NH-Haze [49] PSNR (dB) / SSIM | Dense-Haze [50] PSNR (dB) / SSIM | # Param | # MACs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (ICCV’17) AOD-Net [1] | 19.82 / 0.818 | 20.29 / 0.876 | 17.15 / 0.830 | 15.40 / 0.569 | – / – | 0.002 M | – |
| (ICCV’19) GridDehazeNet [14] | 32.16 / 0.984 | 30.86 / 0.982 | – / – | 13.80 / 0.537 | – / – | 0.96 M | – |
| (AAAI’20) FFA-Net [2] | 36.39 / 0.989 | 33.57 / 0.984 | 26.96 / 0.950 | 19.87 / 0.692 | – / – | 4.68 M | 144.17 G |
| (CVPR’20) MSBDN [25] | 33.79 / 0.984 | – / – | 22.99 / 0.850 | 19.23 / 0.706 | – / – | 31.35 M | 20.79 G |
| (CVPR’20) KDDN [51] | 34.72 / 0.985 | – / – | – / – | 17.39 / 0.590 | – / – | 5.99 M | – |
| (CVPR’21) AECR-Net [3] | 37.17 / 0.990 | – / – | – / – | 19.88 / 0.717 | 15.80 / 0.466 | 2.61 M | 13.05 G |
| (CVPR’22) Dehamer [52] | 36.63 / 0.988 | 35.18 / 0.986 | – / – | 20.66 / 0.684 | 16.62 / 0.560 | 132.45 M | 29.57 G |
| (ECCV’22) PMNet [4] | 38.41 / 0.990 | – / – | 33.49 / 0.980 | – / – | – / – | 18.90 M | – |
| (TIP’23) DehazeFormer-L [13] | 40.05 / 0.996 | – / – | 32.19 / 0.980 | – / – | – / – | 25.44 M | 69.93 G |
| (TIP’23) TUSR-Net [53] | 38.67 / 0.991 | – / – | – / – | – / – | 18.62 / 0.560 | 5.62 M | – |
| (AAAI’24) OKNet-S [54] | 37.59 / 0.994 | 35.45 / 0.992 | – / – | 20.29 / 0.800 | 16.85 / 0.620 | 2.40 M | 8.93 G |
| (IEEE Trans. Instrum’24) MSPD-Net [55] | 39.88 / 0.994 | – / – | 32.97 / 0.987 | – / – | – / – | – | – |
| HAA-Net (Ours) | 41.21 / 0.996 | 35.67 / 0.992 | 33.93 / 0.990 | 21.32 / 0.792 | 18.74 / 0.620 | 18.70 M | 122.48 G |
Table 2. Ablation study of our HAA-Net on the Haze4k Dataset [48].
| Model | PSNR (dB) | SSIM | # Param | # MACs |
| --- | --- | --- | --- | --- |
| Base (U-Net) | 25.46 | 0.91 | 0.85 M | 13.35 G |
| Base + HAAM | 31.76 | 0.97 | 8.6 M | 122.29 G |
| Base + MFEM | 32.32 | 0.98 | 18.2 M | 61.90 G |
| Base + MFEM + HAAM | 33.46 | 0.99 | 18.6 M | 122.44 G |
| Base + MFEM + HAAM + SKFusion (Full) | 33.93 | 0.99 | 18.7 M | 122.48 G |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
