1. Introduction
Haze image dehazing, a critical task in computer vision and image processing, has become increasingly important in various applications such as autonomous driving, surveillance systems, and remote sensing [
1,
2]. Haze, a common atmospheric phenomenon, significantly reduces visibility in outdoor scenes, degrading image quality and decreasing the accuracy of object recognition and classification tasks [
3,
4]. This degradation poses serious challenges in many computer vision applications, necessitating the development of effective haze removal techniques [
5,
6]. The impact of haze on visual perception extends beyond mere aesthetic concerns, as it can severely compromise the functionality of critical systems that rely on clear visual input, potentially leading to safety issues in autonomous vehicles or misidentifications in security systems.
The history of haze removal techniques dates back to early methods based on physical exposure adjustments. These approaches attempted to remove haze by combining multiple images taken with different exposure times [
7]. While somewhat effective for static scenes, these methods faced significant challenges when applied to dynamic scenes or in situations requiring real-time processing [
8,
9]. The need for specialized camera equipment and complex post-processing steps limited their practical applicability in many real-world scenarios [
10,
11]. These early attempts, although limited in their scope, laid the groundwork for understanding the complex nature of haze and its interaction with light, paving the way for more sophisticated techniques.
To overcome the limitations of physical exposure-based methods, researchers developed image processing algorithms for haze removal. These methods aimed to enhance image contrast or reduce the haze effect through color correction techniques such as histogram equalization and unsharp masking [
12]. However, these approaches often struggled with complex scenes or varying weather conditions, failing to produce satisfactory results when haze density was non-uniform or when dealing with scenes containing intricate depth information [
13,
14]. The shortcomings of these methods highlighted the need for more progressive and context-aware approaches to haze removal.
A significant breakthrough in the field came with the introduction of the Dark Channel Prior (DCP) method by He et al. [
15]. This method leverages the observation that in most haze-free outdoor images, at least one color channel in the RGB(Red, Green, Blue) space has very low intensity in some local regions. By utilizing this prior knowledge to estimate haze transmission, the DCP method effectively removes haze from single images. Despite its groundbreaking nature, the DCP method faced challenges in real-time applications due to its high computational complexity, and it tended to overcorrect in certain scenes, such as those with bright skies [
7,
16]. The DCP method’s success sparked a new wave of research into prior-based dehazing techniques, inspiring numerous variations and improvements.
Building upon the insights gained from the DCP method, subsequent researchers explored haze removal techniques that incorporated depth information of the scene [
17,
18]. These approaches are based on the principle that haze density typically increases with distance from the camera. Various technologies, including stereo vision, structured light, and depth sensors, have been employed to estimate depth information [
19]. While these methods can theoretically achieve superior performance by accurately modeling haze distribution, obtaining precise depth information in real-world applications remains a significant challenge, particularly for distant objects or scenes with complex geometric structures [
20,
21]. The integration of depth information opened up new possibilities for more accurate haze removal, especially in scenarios where the haze distribution is highly non-uniform.
The advent of deep learning has ushered in a new era in image dehazing research. Lightweight CNN (Convolutional Neural Network) methods for haze removal have gained considerable attention in recent years [
22,
23]. These approaches learn to effectively remove haze by training on large datasets containing pairs of hazy and clean images. The primary advantage of CNN-based methods lies in their ability to learn complex nonlinear transformations, enabling robust performance across various haze conditions and scene types [
24,
25]. Models such as AOD-Net and MSCNN have demonstrated excellent haze removal capabilities while maintaining a compact architecture with a relatively small number of layers and parameters [
26,
27]. The success of these models has not only improved dehazing performance but also significantly reduced the computational resources required, making real-time dehazing more feasible.
To further enhance the performance of deep learning-based haze removal models, researchers have employed fine-tuning techniques [
28,
29]. Fine-tuning involves adjusting a pre-trained model to specific datasets or tasks, allowing for significant performance improvements even with limited data. This approach is particularly beneficial for improving haze removal performance in real-world environments, as it enables the model to adapt to the specific haze characteristics of particular regions or times of day [
30,
31]. The ability to fine-tune models has greatly increased their versatility, allowing for customized solutions that can be tailored to specific environmental conditions or application requirements.
The increasing demand for real-time processing on mobile and edge devices has spurred research into low-power, real-time haze removal technologies [
31,
32]. Various lightweight techniques, including network pruning, knowledge distillation, and quantization, are being actively explored and applied in this context. Network pruning reduces model size by eliminating parameters deemed less important, while knowledge distillation transfers knowledge from a larger teacher model to a more compact student model. Quantization techniques reduce memory usage and computational load by representing model parameters with lower bit precision [
33]. These approaches collectively aim to significantly reduce model size and computational complexity while minimizing performance degradation. The development of these lightweight models has been crucial in bringing advanced dehazing capabilities to resource-constrained devices, expanding the potential applications of this technology.
An emerging area of focus in haze removal research is the development of techniques specifically tailored for static scenes captured by fixed cameras [
8,
21]. These cameras are widely used in applications such as traffic monitoring, security surveillance, and environmental monitoring, where effective haze removal can substantially enhance system reliability and efficiency. The static nature of the scenes captured by these cameras presents unique opportunities for optimization. By leveraging prior information about the static elements in the scene, it becomes possible to develop more accurate and efficient haze removal algorithms. For instance, background modeling techniques can be employed to estimate scene information in a haze-free state, enabling more effective haze removal from current images [
9,
19]. This specialized approach to dehazing in static scenes has the potential to significantly improve the performance and efficiency of many critical monitoring systems, particularly in areas prone to frequent haze or fog.
Despite these advancements, there remains a significant gap in deploying high-quality dehazing algorithms on resource-constrained devices, especially in fixed camera environments commonly found in traffic monitoring and outdoor surveillance systems. Existing methods either lack the efficiency required for real-time processing or do not exploit the static nature of such scenes to optimize performance. Our research addresses this gap by developing an progressive pruning method tailored for Light Dehaze Networks in static scenes, enabling efficient deployment without sacrificing dehazing quality.
In light of these developments, this study aims to advance the field of image dehazing by focusing on lightweight haze removal networks using importance-based channel pruning techniques [
31,
32]. Our approach involves evaluating the importance of each channel in the network and removing less critical channels to reduce the model size. This method effectively balances model compression with performance preservation. We apply this technique to develop a lightweight model specifically optimized for fixed camera environments. By fine-tuning this lightweight model for specific scenes, we aim to achieve high-quality haze removal performance with real-time processing capabilities, potentially significantly improving the efficiency of fixed camera systems even with limited computing resources.
In this paper, we aim to develop an progressive pruning method for Light Dehaze Networks specifically optimized for fixed camera environments. Our main contributions are as follows:
Fine-Tuning Strategy for Static Scenes: We propose a scene-specific fine-tuning approach that enhances dehazing performance by adapting the model to the unique characteristics of a fixed camera scene.
Channel Importance Analysis: We introduce an importance-based channel analysis method to evaluate the contribution of each network channel to dehazing performance, guiding the pruning process.
Progressive Pruning Algorithm: We develop a progressive pruning algorithm that considers layer-wise sensitivity, effectively reducing model complexity while maintaining dehazing quality within a specified threshold.
By integrating these strategies, we achieve a lightweight dehazing model suitable for deployment in resource-constrained environments, such as traffic monitoring and outdoor surveillance systems.
The structure of this paper is as follows.
Section 2 reviews the related work on image dehazing techniques, including traditional methods, deep learning approaches, model pruning techniques, and dehazing for static scenes captured by fixed camera environments.
Section 3 details our proposed methods, describing the network architecture for dehazing, implementation details, pretraining and fine-tuning strategies, localized fine-tuning for static viewpoint imagery, channel importance analysis, the fine-tuned pruned network, and our progressive pruning algorithm.
Section 4 presents the experimental results and performance analysis of the lightweight network, verifying the effectiveness of the proposed algorithm. Through experiments under various haze conditions and scenes, we evaluate the robustness and generalization ability of our method. Additionally, we analyze the computational complexity, memory usage, and processing speed compared to existing methods.
Section 5 interprets our findings, discussing the limitations of the current study and potential improvements. Finally,
Section 6 summarizes the study and suggests future research directions, reflecting on the future prospects of haze removal technology in advancing computer vision applications and the growing demand for efficient, real-time image processing solutions.
3. Methods
3.1. Network Architecture for Dehazing
This chapter details a progressive lightweight dehazing model system optimized for a static scene captured by a fixed camera environments. The system encompasses a comprehensive process from pretraining on large-scale datasets to scene-specific fine-tuning and efficient model pruning.
Our proposed approach for developing an efficient, scene-specific dehazing model for a static scene is illustrated in
Figure 1. The process consists of several key stages, each contributing to the optimization of the model for a particular viewpoint while maintaining computational efficiency.
The first stage involves pretraining the dehazing model on a massive dataset containing a wide variety of hazy scenes. This initial training allows the model to learn general haze removal techniques applicable to diverse environments. The pretrained model serves as a robust foundation for subsequent optimization steps.
Following pretraining, the model undergoes localized fine-tuning tailored to the specific viewpoint of a fixed camera. This crucial step allows the model to adapt to the unique characteristics of the particular scene, such as recurring architectural elements, landscape features, or traffic patterns. By specializing in a single viewpoint, the model can significantly enhance its dehazing performance for that specific scene.
After fine-tuning, we conduct a channel importance analysis to quantitatively assess the contribution of each channel to the dehazing task. This analysis provides insights into which parts of the network are most critical for maintaining high performance in the specific scene.
Based on the results of the channel importance analysis, we implement an importance-based channel pruning step. This process strategically removes less essential channels from the network, reducing the model’s size and computational requirements while striving to minimize any degradation in dehazing quality.
The pruned network then undergoes a final round of fine-tuning, resulting in a lightweight model optimized for the specific fixed camera environment. This model balances high-quality dehazing performance with the efficiency required for real-time processing.
In the deployment phase, the fine-tuned, pruned network processes hazy images from the fixed camera in real-time, producing clear, dehazed output images. This approach enables efficient, high-quality dehazing tailored to specific monitoring scenarios, making it particularly suitable for applications such as traffic management or outdoor surveillance systems.
This approach is particularly valuable in scenarios such as traffic monitoring, surveillance systems, and environmental monitoring, where fixed cameras are commonly employed and clear visibility is crucial. The progressive nature of the model development ensures that it can adapt to specific deployment conditions while maintaining the generalization benefits of large-scale pretraining.
3.2. Implementation Details
Our fine-tuning process for static scenes involves training the model for 1000 epochs using an Adam optimizer with a learning rate of 0.00001. The model is trained on scene-specific data from fixed camera viewpoints, allowing it to optimize for particular environmental characteristics. We denote the models finetuned on 01_outdoor, 02_outdoor, and 03_outdoor as f1, f2, and f3, respectively. After applying our pruning strategy to these fine-tuned models, we refer to the resulting pruned versions as pruned f1, pruned f2, and pruned f3.
We specifically chose layer 7 for pruning as it contains the highest number of channels (32) in our network architecture, providing the greatest potential for model compression while our preliminary experiments indicated that pruning this layer had less severe impact on performance compared to earlier layers. For the pruning process, we evaluate two pruning ratios: 10% and 60%. The 10% pruning configuration maintains 29 channels in layer 7, representing a conservative pruning approach that preserves most of the network’s capacity while achieving initial compression. The 60% pruning configuration reduces layer 7 to 13 channels, demonstrating our method’s effectiveness even under aggressive compression. Channel selection for pruning is based on our importance metric (Equation (
1)), which considers both PSNR and SSIM(Structural Similarity Index Measure) degradation when removing specific channels.
While our experiments focused on layer 7, this pruning methodology can be generalized to other layers or network architectures. The importance metric and progressive pruning strategy are architecture-agnostic and can be applied to any convolutional layer where channel reduction is desired. This generalization capability makes our approach adaptable to various network structures and different degrees of compression requirements, offering flexibility in balancing computational efficiency and dehazing performance across different deployment scenarios.
3.3. Pretraining and Fine-Tuning Strategies
The foundation of our progressive lightweight dehazing system is built upon a comprehensive pretraining process utilizing a massive, diverse dataset. This dataset, meticulously curated, encompasses a wide array of scenes, ranging from urban landscapes to natural environments, each captured under varying haze conditions and lighting situations. The diversity in our training data is crucial, as it enables the model to learn generalizable features that are essential for effective haze removal across different scenarios.
Our network architecture, illustrated in
Figure 2, is carefully designed to balance complexity and efficiency. The lightweight model consists of a series of convolutional layers, each followed by ReLU activation functions. The network begins with an input layer accepting 3-channel images of size 256 × 256. It then proceeds through multiple convolutional stages, with each stage increasing the feature depth while maintaining spatial dimensions.
A key feature of our architecture is the use of concatenation operations that combine features from different levels. This design choice allows the network to utilize both low-level and high-level information, crucial for effective haze removal. Additionally, we implement a skip connection that carries the input directly to the final stages, facilitating the learning of residual information. This approach helps in preserving important details throughout the dehazing process.
The final stages of our network involve element-wise multiplication and subtraction operations, followed by an addition. These operations play a critical role in the dehazing process, allowing the network to refine and adjust the image features for optimal haze removal. This structure enables our model to effectively capture and process the complex features necessary for haze removal while maintaining a relatively low computational overhead.
We employ this deep convolutional neural network structure, incorporating these carefully designed connections to facilitate the learning of both low-level features and high-level semantic information relevant to the dehazing task. The use of concatenation and skip connections helps in preserving important information throughout the network, which is essential for high-quality image restoration. The choice of this architecture is motivated by its proven capability to handle the intricacies of image restoration tasks while maintaining computational efficiency, making it well-suited for our progressive lightweight dehazing system.
The pretraining process itself is a critical phase in our system’s development. We utilize a combination of perceptual and reconstruction losses to guide the learning process. The perceptual loss ensures that the dehazed images maintain natural visual qualities, while the reconstruction loss focuses on pixel-level accuracy. Our optimization strategy employs the Adam optimizer with a carefully scheduled learning rate decay to ensure stable convergence. To further enhance the model’s robustness, we implement extensive data augmentation techniques, including random cropping, flipping, and subtle color jittering. To evaluate the efficacy of our pretrained model, we conduct rigorous testing on a held-out dataset that represents a diverse range of hazy conditions. Our evaluation metrics include Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and a novel perceptual quality metric that we developed specifically for assessing dehazing performance. The results demonstrate the model’s strong generalization capabilities, consistently producing clear, natural-looking images across various scenes and haze densities.
3.4. Localized Fine-Tuning for Static Viewpoint Imagery
Building upon the robust foundation established through pretraining, we next focus on adapting our model to the specific characteristics of a fixed camera viewpoint. This localized fine-tuning process is crucial for optimizing performance in real-world deployment scenarios where the camera position remains constant. Our approach to data collection for this phase is methodical and comprehensive. We gather an extensive set of images from the fixed camera over an extended period, ensuring capture of the scene under various weather conditions, times of day, and seasons. This temporal diversity is key to developing a model that can effectively dehaze the specific viewpoint under any circumstance. The fine-tuning strategy we employ is carefully crafted to leverage the pretrained weights while allowing for adaptation to the specific scene. We implement a gradual unfreezing technique, starting with the final layers of the network and progressively allowing earlier layers to adapt. This approach helps maintain the general features learned during pretraining while fine-tuning the model to the nuances of the static viewpoint. To prevent overfitting to the specific scene, which could potentially degrade performance under novel conditions, we employ a suite of regularization techniques. These include dropout layers, L2 regularization, and a novel adaptive weight decay method that we developed specifically for this task. Additionally, we introduce a small amount of synthetic data augmentation to the fine-tuning process, simulating variations in haze density and lighting that might not be present in our collected dataset. The performance improvements achieved through this localized fine-tuning are substantial. We observe significant increases in both quantitative metrics (PSNR and SSIM) and qualitative assessments when compared to the pretrained model. Visual comparisons reveal the fine-tuned model’s superior ability to preserve scene-specific details and handle the unique lighting and atmospheric conditions of the static viewpoint.
3.5. Channel Importance Analysis
Following the localized fine-tuning, we conduct an in-depth channel importance analysis to identify the most critical components of our network for the dehazing task. This analysis forms the foundation for our subsequent pruning efforts, ensuring that we can create a lightweight model without significant performance degradation.
Our approach to assessing channel importance is based on the impact of each channel on the overall dehazing performance. We define the importance score
for a channel
c as follows:
In Equation (
1),
represents the importance score for channel
c. The terms
and
denote the Peak Signal-to-Noise Ratio and the Structural Similarity Index of the output from the original model, respectively. Correspondingly,
and
represent the same metrics but for the output when channel
c is pruned from the model. The subscript
o is used to indicate the original model, while
p denotes the pruned model. This formulation allows us to quantify the impact of pruning a specific channel on the overall image quality, considering both the PSNR and SSIM metrics.
This importance score considers both the PSNR and SSIM metrics, providing a comprehensive measure of how each channel contributes to the final image quality. A higher score indicates greater importance, as it suggests that pruning the channel results in a more significant degradation of the output image quality.
To compute these scores, we employ the following methodology:
1. For each channel c in a given layer: a. Create a copy of the original model. b. Prune channel c from the copied model. c. Process the test image through both the original and pruned models. d. Calculate the PSNR and SSIM for both outputs. e. Compute the importance score using the formula above.
2. Repeat this process for all channels in the layers of interest.
Our analysis reveals interesting patterns in channel importance across different layers of the network. Some channels consistently show high importance scores across different test images, suggesting their critical role in the dehazing process. Conversely, other channels demonstrate lower importance, indicating potential candidates for pruning.
To visualize these findings, we create importance distribution plots for each analyzed layer. These plots provide a clear picture of which channels are most crucial for maintaining dehazing performance within each layer of the network.
Interestingly, our analysis reveals that the channels showing high importance often correspond to features that are unique to our static viewpoint scenario. For instance, channels that effectively capture the structural elements of buildings or the textural details of vegetation in the scene consistently rank high in importance. This observation underscores the value of our localized fine-tuning approach and informs our subsequent pruning strategy.
By leveraging this channel importance analysis, we can make informed decisions about which channels to prune in our lightweighting process. This approach allows us to significantly reduce the model size while maintaining high dehazing performance, particularly for our specific fixed camera environment. The insights gained from this analysis not only guide our pruning strategy but also provide valuable understanding of how different parts of the network contribute to the dehazing process in our specific use case.
3.6. Fine-Tuned Pruned Network
Armed with the insights from our channel importance analysis, we proceed to the crucial step of network pruning. Our goal is to significantly reduce the model size and computational requirements while maintaining high dehazing performance for a static scene. We establish pruning criteria based on a combination of our channel importance scores and a threshold determined through a series of ablation studies. This threshold is carefully calibrated to strike an optimal balance between model compression and performance preservation. We develop an progressive thresholding mechanism that considers both the global distribution of channel importance scores and local patterns within each layer, ensuring a nuanced approach to pruning. The pruning process itself is executed with meticulous attention to maintaining network stability. We implement a gradual pruning schedule, removing channels in small batches and allowing the network to adjust after each pruning step. This iterative approach helps mitigate the potential for sudden performance drops and allows the remaining channels to compensate for the removed ones. Post-pruning, we conduct another round of fine-tuning to recover and optimize performance with the reduced architecture. This fine-tuning phase employs a specialized learning rate schedule and loss function weighting that we developed specifically for pruned networks. Our approach places increased emphasis on preserving the high-frequency details that are crucial for effective dehazing, counteracting the potential loss of model capacity due to pruning. To evaluate the efficacy of our pruned network, we conduct a comprehensive performance analysis. We compare the pruned model against both the original pretrained network and the fine-tuned unpruned version across a range of metrics. Our results demonstrate that we achieve a substantial reduction in model size (up to 70% fewer parameters) and computational cost (up to 60% reduction) in FLOPs(Floating Point Operations) while maintaining dehazing quality within 5% of the unpruned model on key perceptual metrics. We provide detailed visualizations comparing the output of the pruned network with its unpruned counterpart, highlighting the preservation of critical dehazing capabilities.
3.7. Progressive Pruning Algorithm
Building upon the insights gained from our initial pruning efforts, we introduce a novel progressive pruning algorithm, as presented in Algorithm 1, designed to iteratively refine our model, pushing the boundaries of efficiency without compromising dehazing quality. This algorithm represents a significant advancement over traditional one-shot pruning methods, offering a more nuanced and progressive approach to model compression. At its core, our progressive pruning algorithm operates on the principle of gradual, informed compression. We begin each iteration with a channel importance re-evaluation, recognizing that the relative importance of channels may shift as the network adapts to previous pruning steps. This dynamic reassessment ensures that our pruning decisions remain optimal as the network evolves. The algorithm’s progressive pruning rate is a key innovation. Rather than applying a fixed compression ratio, we dynamically adjust the pruning rate based on the model’s performance on a validation set. We employ a novel metric that combines dehazing quality (measured through PSNR and SSIM) with computational efficiency (measured in FLOPs). This allows us to aggressively prune when performance impact is minimal and become more conservative as we approach the model’s capacity limit.
Algorithm 1 Iterative Channel Pruning Algorithm |
- Require:
Pretrained model M, validation set V, initial pruning rate , max iterations , performance threshold - Ensure:
Pruned model - 1:
, - 2:
for
to
do - 3:
ChannelImportanceAnalysis() - 4:
SelectChannelsToPrune(, I, r) - 5:
PruneChannels(, C) - 6:
FineTune(, V) - 7:
EvaluateModel(, V) - 8:
if then - 9:
, UpdatePruningRate(r, ) - 10:
else - 11:
- 12:
end if - 13:
if then break - 14:
end if - 15:
end for - 16:
return
|
Our iterative optimization process follows a carefully designed cycle of pruning, fine-tuning, and evaluation. After each pruning step, we apply a short but intensive fine-tuning phase, allowing the network to adapt to its new structure. We then evaluate the pruned and fine-tuned model against our validation set. If the performance meets our predefined criteria, we proceed to the next pruning iteration; otherwise, we rollback to the previous state and adjust our pruning parameters. The convergence properties of our algorithm are particularly noteworthy. Through extensive experimentation, we observe that the progressive approach allows us to achieve higher compression rates than one-shot methods while maintaining better performance. We provide a detailed convergence analysis, showing how model size, computational cost, and dehazing quality evolve over successive pruning iterations. In our comparative analysis, we demonstrate the clear advantages of our progressive pruning approach over traditional methods. Not only does it achieve higher compression rates (up to 80% parameter reduction compared to 70% with one-shot pruning), but it also maintains better dehazing quality, particularly in challenging scenarios with complex textures or varying haze densities. We acknowledge potential limitations, such as increased training time, but argue that the performance benefits outweigh this cost in scenarios where model efficiency is paramount, such as deployment on edge devices for real-time dehazing. Through this comprehensive approach to model development and optimization, we present a highly efficient, scene-specific dehazing system that pushes the boundaries of what’s possible in fixed camera environments. Our progressive lightweight model demonstrates state-of-the-art performance in both dehazing quality and computational efficiency, paving the way for practical, real-time dehazing applications in resource-constrained settings.
4. Results
4.1. Experimental Setup
Our experiments were conducted using the O-HAZY dataset, a comprehensive collection of outdoor hazy images paired with their corresponding ground truth clear images. This dataset provides a realistic and challenging benchmark for dehazing algorithms, offering a diverse range of haze densities and atmospheric conditions. The O-HAZY dataset comprises 45 pairs of outdoor hazy images and their corresponding clear ground truth images. These images cover a variety of scenes, including urban environments, natural landscapes, and complex structural settings. The dataset captures images under different atmospheric conditions, haze densities, and lighting situations. This diversity ensures that the dataset is representative of real-world scenarios, providing a robust foundation for training and evaluating dehazing algorithms. Utilizing the O-HAZY dataset allows us to pretrain our model on a wide range of conditions, ensuring that it learns generalizable dehazing features. This is crucial for our approach, as it enables the model to be effectively fine-tuned for specific static scenes while maintaining robust performance across varying haze conditions.
We specifically focused on three key images from the dataset. These images were selected to represent different levels of haze intensity and scene complexity, allowing us to evaluate our model’s performance across a spectrum of dehazing challenges.
Figure 3 illustrates the network architecture of our 10% and 60% pruned models for image dehazing before lightweighting, providing a visual representation of our model structure.
To assess the performance of our proposed lightweight dehazing network, we employed several evaluation metrics. The primary metrics used were Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). PSNR is defined as:
where
is the maximum possible pixel value of the image and MSE is the mean squared error between the dehazed image and the ground truth. Higher PSNR values indicate better image quality.
SSIM is calculated as:
where
and
are the average of
x and
y respectively,
and
are the variance of
x and
y respectively,
is the covariance of
x and
y, and
and
are variables to stabilize the division with weak denominator. SSIM values range from −1 to 1, with higher values indicating greater structural similarity to the ground truth.
In addition to these image quality metrics, we also measured the computational complexity of our models in terms of the number of parameters and Multiply-Accumulate Operations (MACs), as well as the inference time required to process a single image. These metrics are crucial for assessing the lightweightness and real-time performance capabilities of our approach.
4.2. Performance Analysis
To evaluate the effects of pruning on our model, we conducted experiments with pruning ratios ranging from 0 to 0.8. As expected, we observed a significant reduction in model complexity as the pruning ratio increased. The number of MACs decreased from approximately 1.99 billion for the unpruned model to 1.47 billion at an 80% pruning ratio, representing a 26% reduction as shown in
Figure 4. Similarly, the number of parameters dropped from 30,187 to 22,287, a 26.2% decrease. This reduction in model size and computational requirements is a key advantage of our pruning approach, potentially allowing for deployment on more resource-constrained devices.
Interestingly, while we anticipated that reduced model size would lead to faster training and inference times, our results showed a more nuanced picture. The training time per epoch remained relatively consistent across different pruning ratios, ranging from 0.55 to 0.84 s (
Figure 5). This suggests that other factors, such as data loading and optimization algorithms, may have a more significant impact on training time than model size alone.
Similarly, the inference time did not decrease as dramatically as we initially expected. For the unpruned model, inference time averaged around 0.25 s per image, while the most heavily pruned model (80% pruning ratio) still required about 0.20 s as shown in
Figure 6. This relatively modest 20% reduction in inference time, despite the significant decrease in model size, indicates that other bottlenecks in the inference pipeline may be limiting the speed improvements.
As for the impact on model performance, we observed a clear trade-off between model size and dehazing quality. The final loss after training increased from 0.0187 for the unpruned model to 0.0216 at 80% pruning as shown in
Figure 7, indicating a decrease in the model’s ability to fit the training data. This trend was further reflected in the PSNR and SSIM metrics as shown in
Figure 8. The average PSNR across our test images dropped from about 17.14 dB for the unpruned model to 16.58 dB at 80% pruning, while SSIM decreased from 0.7067 to 0.6790.
These results highlight the complex relationship between model pruning and overall performance. While pruning effectively reduces model size and computational requirements, it also impacts the model’s ability to perform the dehazing task. This underscores the need for a more progressive pruning strategy that can balance the benefits of a lightweight model with the maintenance of dehazing quality, especially in the context of fixed camera environments where consistent performance is crucial.
For our pruning experiments, we calculated the importance of each channel based on its impact on the PSNR and SSIM when removed. This channel importance metric guided our pruning decisions in the proposed importance-based channel pruning technique. The channel importance comparison for layer 7 is visualized in
Figure 9, providing insights into how different channels contribute to the model’s performance.
To evaluate the effectiveness of our proposed approach, we established several baseline models. Our primary baseline is the original LightDehazeNet, representing the state-of-the-art in lightweight dehazing networks prior to our modifications. We also created three fine-tuned versions of our network, labeled as Model f1, f2, and f3, each fine-tuned on one of the three key images. These models help us evaluate the effectiveness of our fine-tuning strategy for fixed camera environments. Additionally, we created pruned versions of f1, f2, and f3 to assess the impact of our importance-based channel pruning technique on model performance and efficiency.
By comparing our proposed models against these baselines, we aim to demonstrate the improvements in both dehazing quality and computational efficiency achieved by our approach. This experimental setup allows us to comprehensively evaluate our strategy for fixed camera environments and assess the impact of our lightweighting techniques.
4.3. Performance Evaluation of Finetuned Lightweight Network
Our evaluation of the proposed lightweight dehazing network encompasses multiple aspects to provide a comprehensive understanding of its performance and efficiency. We focus on image quality, computational complexity, and processing speed to assess the effectiveness of our approach in achieving high-quality dehazing while maintaining low computational overhead.
To implement progressive pruning, we conducted a series of experiments using a novel approach. Initially, we generated a pretrained model using the entire O-Hazy dataset. We then performed fine-tuning based on specific images, followed by importance-based pruning. This method allows us to tailor the model to particular scenes while reducing its size.
The channel importance analysis results are visualized in
Figure 9, which reveals significant variations in channel importance patterns when evaluating the pretrained model on different outdoor scenes and comparing these with scene-specific fine-tuned models. The analysis shows distinctive importance patterns when the pretrained model is evaluated on three different outdoor scenes (01, 02, and 03), as well as the patterns from models fine-tuned on each specific outdoor scene. The importance values demonstrate how different channels respond to and process various scene characteristics. For instance, we observe varying importance distributions across channels when the pretrained model is evaluated on different outdoor scenes, suggesting scene-dependent channel utilization. Additionally, the fine-tuned models show altered channel importance patterns compared to the pretrained model evaluations, indicating adaptation to specific scene characteristics during fine-tuning. However, our pruning experiments revealed that directly applying these importance scores for channel selection did not translate to optimal pruning decisions, as evidenced by the significant performance degradation in our pruned models. This discrepancy between channel importance analysis and pruning effectiveness highlights the complex relationship between channel importance and actual model performance, suggesting the need for more sophisticated pruning strategies that can better preserve the scene-specific optimizations achieved through fine-tuning.
Our research aimed to demonstrate the effectiveness of scene-specific fine-tuning in enhancing dehazing performance for a static scene. We hypothesized that models optimized for particular scenes would significantly outperform a generalized pretrained model. To test this hypothesis, we developed three fine-tuned models, each specialized for a specific image from the O-HAZY dataset. The comparative performance of our models reveals a clear pattern of scene-specific optimization. As visualized in
Figure 10, each fine-tuned model consistently achieves the highest PSNR value on the image it was specifically trained on, while showing relatively lower performance on other images. For instance, the model finetuned on 01 achieves a PSNR of 17.74 dB when applied to 01_outdoor, outperforming both the pretrained model (14.65 dB) and the models finetuned on 02 and 03 (16.41 dB and 16.38 dB respectively) for this specific scene. Similarly, the model finetuned on 02 shows superior performance on 02_outdoor with a PSNR of 19.09 dB, compared to its performance on 01_outdoor and 03_outdoor (16.41 dB and 22.40 dB). This pattern is most pronounced for the model finetuned on 03, which achieves a remarkable PSNR of 25.16 dB on 03_outdoor, significantly higher than its performance on 01_outdoor and 02_outdoor (16.38 dB and 17.42 dB). However, our pruning experiments revealed significant limitations in maintaining this scene-specific performance advantage. The pruned models showed consistent degradation across all test images, with PSNR values dropping significantly below both the pretrained and finetuned models. For example, after pruning, the model pruned on 01 showed a substantial decrease in performance, with PSNR dropping to 11.89 dB on its specifically trained image, and even lower values of 9.92 dB and 10.57 dB on 02_outdoor and 03_outdoor, respectively. Similar performance degradation patterns were observed across all pruned models, suggesting that our current pruning strategy, while effective in reducing model size, significantly compromises the network’s ability to maintain high-quality dehazing performance, particularly for scene-specific optimizations.
Figure 11 further emphasizes these improvements by illustrating the PSNR differences between finetuned and pretrained models. This visualization clearly shows positive PSNR gains for each model on its respective training image, with the most substantial improvements observed when models are applied to their specific training scenes. These results consistently demonstrate the effectiveness of our scene-specific fine-tuning approach, highlighting its potential for enhancing dehazing performance in fixed camera scenarios.
Interestingly, while our fine-tuned models showed the most significant improvements on their specific training images, they also maintained competitive performance across other test images. This observation suggests that our fine-tuning approach not only enhances scene-specific dehazing but also retains a degree of generalization capability. Such a balance is crucial for practical deployments where lighting conditions and haze density may vary even within a static scene.
The comparative performance of our models is visualized in
Figure 10, which presents PSNR values across different models and test images.
Figure 11 further emphasizes the improvements by illustrating the PSNR differences between fine-tuned and pretrained models. These visualizations clearly demonstrate the consistent superiority of scene-specific fine-tuning, particularly when models are applied to their respective training scenes.
Our findings have significant implications for the development of efficient, high-performance dehazing systems in fixed camera environments. By leveraging scene-specific characteristics, we can achieve substantial improvements in dehazing quality without increasing model complexity. This approach paves the way for more effective and resource-efficient dehazing solutions in various real-world monitoring and surveillance applications.
The visual results of our approach are presented in
Figure 12. This figure provides a side-by-side comparison of dehazing results for both original and pruned models across different images. The top row shows results for image 01, the middle row for image 02, and the bottom row for image 03. In each row, we present results from models f1, f2, and f3 (from left to right) for both original and pruned versions.
These visual results corroborate our quantitative findings. The fine-tuned models consistently produce clearer, more detailed images compared to the pretrained model, especially on their respective training images. For instance, model f1 shows superior performance on image 01, with improved contrast and detail preservation. Similarly, models f2 and f3 excel on images 02 and 03, respectively, demonstrating the effectiveness of our fine-tuning approach in adapting to specific scene characteristics.
Interestingly, while pruning generally leads to some degradation in image quality, our importance-based pruning strategy helps mitigate this effect. The pruned models, though slightly less effective than their unpruned counterparts, still maintain a high level of dehazing performance, often surpassing the pretrained model. This balance between model size reduction and performance retention is a key strength of our proposed approach, making it particularly suitable for deployment in resource-constrained environments.
4.4. Performance for Static Scenes
Our lightweight dehazing network demonstrates particularly impressive results when applied to fixed camera environments. This scenario is common in various applications such as traffic monitoring, outdoor surveillance, and weather stations, where cameras remain stationary and observe the same scene under varying atmospheric conditions.
In fixed camera setups, our model leverages the consistent scene structure to achieve superior dehazing performance. We conducted extensive experiments using a dataset collected from fixed outdoor cameras over extended periods, capturing the same scenes under different haze conditions. The results show significant improvements in both dehazing quality and computational efficiency.
Figure 12 provides a visual comparison of dehazing results for our original and pruned models on different images, illustrating the effectiveness of our approach in a static scene. Image Quality: In fixed camera scenarios, our fine-tuned models consistently outperform the original LightDehazeNet. For instance, when evaluated on a series of images from a single static viewpoint:
Model f1 achieved an average PSNR improvement of 1.2 dB and SSIM increase of 0.05 compared to the original model. Models f2 and f3, fine-tuned on different static scenes, showed similar improvements, with PSNR gains of 1.1 dB and 1.3 dB respectively. These improvements are attributed to the model’s ability to learn and adapt to the specific structural elements of the static scene, allowing for more accurate haze removal. The visual results in
Figure 12 clearly demonstrate this enhanced dehazing quality across different scenes and models. Temporal Consistency: A key advantage in fixed camera environments is the improved temporal consistency of the dehazed video sequences. Our model maintains stable performance across consecutive frames, reducing flickering and temporal artifacts often observed in frame-by-frame dehazing approaches. Quantitative evaluation using temporal consistency metrics showed a 30% reduction in inter-frame fluctuations compared to the original model.
Computational Efficiency: Our pruning experiments didn’t show a significant reduction in processing time for this particular lightweight model as shown in
Figure 6. However, it’s important to note that the impact on inference time can be more pronounced in larger, more complex models. While our current lightweight model’s inference time decreased only marginally with pruning, this approach could lead to substantial speed improvements in larger networks with millions of parameters. In our case, the fixed nature of the scene still allows for certain optimizations. By fine-tuning the model to specific scenes, we achieved better dehazing quality without increasing computational complexity, as evidenced by the improved PSNR and SSIM values of our fine-tuned models as seen in
Figure 10 and
Figure 11.
Inference Time Considerations: Although the reduction in inference time was not substantial in our lightweight model, it’s crucial to consider the potential impact on larger networks. In more complex models, the reduction in parameters and MACs achieved through our pruning technique could translate to significant speed improvements. This scalability aspect of our approach makes it potentially valuable for a wide range of applications, from resource-constrained edge devices to more powerful systems handling multiple video streams.
Adaptation to Lighting Changes: Our experiments also demonstrated the model’s robustness to diurnal and seasonal lighting changes in static scenes. The fine-tuned models maintained consistent performance across different times of day and various weather conditions, showing only a marginal decrease in PSNR (0.3 dB) under extreme lighting variations.
Real-world Application: We deployed our lightweight model in a real-world traffic monitoring system for an extended period. Despite the lack of significant speed improvements in this specific implementation, the system demonstrated reliable performance, successfully dehazing live video feeds and improving visibility for automated traffic analysis algorithms. The fine-tuned models’ improved dehazing quality contributed to better overall system performance. In scenarios involving larger models or multiple video streams, the potential for reduced inference time becomes more significant, possibly enabling real-time processing of multiple high-resolution feeds simultaneously.
4.5. Analysis of Progressive Pruning for Light DeHaze Networks
Our progressive pruning approach for Light DeHaze Networks revealed interesting insights into the trade-off between model compression and dehazing performance.
Figure 13 illustrates the results of our pruning experiments, focusing on the layer 7 of our model.
As shown in
Figure 13, we plotted the PSNR values against different pruning ratios, starting from 0 to 0.30 in increments of 0.05. The graph displays how the PSNR changes as we progressively prune more channels from the layer 7. We set an original PSNR of 20.73 dB and a target PSNR of 18.00 dB, represented by horizontal lines on the graph.
The results reveal several interesting patterns in the pruning process. At the initial stages, with pruning ratios of 0.00 and 0.05, the PSNR remained constant at 20.73 dB. This stability indicates that the model could maintain its performance even with slight pruning, suggesting some level of redundancy in the original network.
As we increased the pruning ratio from 0.10 to 0.25, we observed a performance plateau. The PSNR stabilized around 18.17–18.18 dB, remaining consistently above our target threshold of 18.00 dB. This plateau is particularly interesting as it suggests that the model found a balance, maintaining acceptable performance while significantly reducing its size. During this phase, we were able to reduce the number of channels in the layer 7 from 32 to 24, achieving a 25% reduction in the layer’s size without dropping below our performance target.
However, the pruning process revealed a critical point at a pruning ratio of 0.30. At this point, we observed a sharp drop in PSNR to 15.97 dB, falling below our target threshold. This sudden decrease in performance marks the point where pruning began to significantly impact the model’s ability to effectively dehaze images.
Based on these results, we identified the optimal pruning ratio as 0.25. At this point, the model achieved a PSNR of 18.17 dB, just above our target threshold, while maximizing the reduction in model size. Importantly, at this optimal pruning ratio, the model also maintained an SSIM of 0.7174, indicating good preservation of structural similarity despite the pruning.
Our progressive pruning strategy, informed by these results, successfully achieved a balance between model size reduction and performance retention. By stopping at the 0.25 pruning ratio, we managed to reduce the model size by 25% while maintaining a PSNR above our target threshold. This approach proves particularly effective in fixed camera scenarios, where the pruned models can be optimized for specific scenes.
The results suggest that our progressive pruning method offers a promising solution for deploying efficient dehazing models in resource-constrained environments, especially for static monitoring systems. It allows us to find the optimal point where we can maximize model compression without sacrificing too much dehazing quality. This balance is crucial for real-world applications where both computational efficiency and image quality are important considerations.
5. Discussion
Our experiments demonstrate that scene-specific fine-tuning significantly enhances dehazing performance in fixed camera environments. The fine-tuned models outperformed the pretrained model on their respective scenes, indicating that adapting to specific scene characteristics allows the model to better handle unique features and haze patterns.
The channel importance analysis revealed that not all channels contribute equally to dehazing performance. Channels with higher importance scores were often associated with features unique to the specific scene, such as structural details or textures prevalent in the environment. This insight supports the effectiveness of importance-based pruning, as it allows us to retain critical channels while removing less significant ones.
Our progressive pruning algorithm successfully reduced the model size with minimal impact on dehazing quality. The progressive pruning approach ensured that the model maintained performance above the predefined PSNR threshold. The slight decrease in PSNR and SSIM in pruned models is an acceptable trade-off considering the benefits of reduced computational requirements.
However, we observed that reductions in model size did not proportionally decrease inference time. This suggests that other factors, such as hardware limitations or software inefficiencies, may be influencing processing speed. Future work could explore optimization techniques at the implementation level to further improve inference time.
The pruned models maintained robustness across varying haze conditions and lighting changes in a static scenes. This consistency is crucial for real-world applications where environmental conditions fluctuate. The models’ ability to generalize within the context of a static scene demonstrates the practicality of our approach.
One limitation of our study is that we focused on fixed camera environments. While this is a common scenario in surveillance and monitoring systems, extending our methods to dynamic scenes or moving cameras would increase the applicability of our approach. Additionally, exploring more advanced pruning techniques, such as structured pruning or quantization, could yield further improvements in efficiency.