Deep Dynamic Weights for Underwater Image Restoration
Abstract
1. Introduction
- We observe that the color components of degraded underwater images exhibit either linear or non-linear relationships among them. Accordingly, images are classified as Type I or Type II, and treating each type with a dedicated model yields better restoration results.
- The Deep Line Model (DLM) is proposed for input images whose color components are linearly related. Because the relationship is linear, each pixel can be improved with a line model; the DLM learns the line parameters for every pixel.
- The Deep Curve Model (DCM) is proposed for images whose color components are non-linearly related. In this case a line model may not suffice, and a curve is more appropriate and effective; the DCM therefore learns the curve parameters for every pixel.
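The two per-pixel mappings above can be sketched as follows. This is a minimal NumPy illustration, assuming images scaled to [0, 1] and per-pixel parameter maps predicted by a network; the function names, parameter shapes, and the iterated quadratic curve form (in the spirit of Zero-DCE-style curve estimation, Guo et al.) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def apply_line_model(img, a, b):
    # Per-pixel line model: y = a*x + b, where a and b are
    # per-pixel parameter maps with the same shape as img
    # (hypothetical parameterization for illustration).
    return np.clip(a * img + b, 0.0, 1.0)

def apply_curve_model(img, alpha, iterations=4):
    # Per-pixel quadratic curve applied iteratively:
    # y = x + alpha * x * (1 - x), with alpha a per-pixel map.
    # For alpha in [0, 1] this brightens while keeping y in [0, 1].
    x = img
    for _ in range(iterations):
        x = x + alpha * x * (1.0 - x)
    return np.clip(x, 0.0, 1.0)

# Toy usage on a random "underwater" image in [0, 1].
rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))
a = np.full_like(img, 1.1)      # mild per-pixel contrast stretch
b = np.full_like(img, -0.02)    # small per-pixel offset
alpha = np.full_like(img, 0.3)  # brightening curve parameter

lin = apply_line_model(img, a, b)
cur = apply_curve_model(img, alpha)
```

In the paper, the parameter maps (`a`, `b` for the DLM; `alpha` for the DCM) would come from the learned networks rather than being constant as in this toy usage.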
2. Related Work
2.1. Underwater Physical Imaging Model
2.2. Underwater Restoration Techniques
3. Motivation
4. Proposed Method
4.1. Image Classifier
4.2. Deep Line Model
4.3. Deep Curve Model
5. Results and Discussion
5.1. Datasets
5.2. Evaluation Metrics
5.3. Implementation
5.4. Comparative Analysis
5.5. Ablation Study
5.6. Complexity of Models
5.7. Limitations and Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| ASM | atmospheric scattering model |
| CNN | convolutional neural network |
| DCP | dark-channel prior |
| DCM | Deep Curve Model |
| DLM | Deep Line Model |
| GFN | gated fusion network |
| IFM | image formation model |
| MCP | medium-channel prior |
| PSNR | peak signal-to-noise ratio |
| RCP | red-channel prior |
| RMSE | root mean square error |
| TM | transmission map |
| UIEM | underwater image enhancement model |
References
- Akkaynak, D.; Treibitz, T.; Shlesinger, T.; Loya, Y.; Tamir, R.; Iluz, D. What is the space of attenuation coefficients in underwater computer vision? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4931–4940.
- Peng, Y.T.; Cosman, P.C. Underwater image restoration based on image blurriness and light absorption. IEEE Trans. Image Process. 2017, 26, 1579–1594.
- He, K.; Sun, J.; Tang, X. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 2341–2353.
- Raveendran, S.; Patil, M.D.; Birajdar, G.K. Underwater image enhancement: A comprehensive review, recent trends, challenges and applications. Artif. Intell. Rev. 2021, 54, 5413–5467.
- Wu, H.; Liu, J.; Xie, Y.; Qu, Y.; Ma, L. Knowledge transfer dehazing network for nonhomogeneous dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Virtual, 14–19 June 2020; pp. 478–479.
- Schechner, Y.Y.; Karpel, N. Recovery of underwater visibility and structure by polarization analysis. IEEE J. Ocean. Eng. 2005, 30, 570–587.
- Galdran, A.; Pardo, D.; Picón, A.; Alvarez-Gila, A. Automatic Red-Channel underwater image restoration. J. Vis. Commun. Image Represent. 2015, 26, 132–145.
- Gibson, K.B.; Vo, D.T.; Nguyen, T.Q. An investigation of dehazing effects on image and video coding. IEEE Trans. Image Process. 2011, 21, 662–673.
- Berman, D.; Levy, D.; Avidan, S.; Treibitz, T. Underwater single image color restoration using haze-lines and a new quantitative dataset. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 2822–2837.
- Xie, J.; Hou, G.; Wang, G.; Pan, Z. A Variational Framework for Underwater Image Dehazing and Deblurring. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 3514–3526.
- Ancuti, C.O.; Ancuti, C.; De Vleeschouwer, C.; Bekaert, P. Color balance and fusion for underwater image enhancement. IEEE Trans. Image Process. 2017, 27, 379–393.
- Zhang, W.; Zhuang, P.; Sun, H.H.; Li, G.; Kwong, S.; Li, C. Underwater image enhancement via minimal color loss and locally adaptive contrast enhancement. IEEE Trans. Image Process. 2022, 31, 3997–4010.
- Zhang, W.; Dong, L.; Xu, W. Retinex-inspired color correction and detail preserved fusion for underwater image enhancement. Comput. Electron. Agric. 2022, 192, 106585.
- Cai, B.; Xu, X.; Jia, K.; Qing, C.; Tao, D. DehazeNet: An End-to-End System for Single Image Haze Removal. IEEE Trans. Image Process. 2016, 25, 5187–5198.
- Ren, W.; Liu, S.; Zhang, H.; Pan, J.; Cao, X.; Yang, M.H. Single image dehazing via multi-scale convolutional neural networks. In Proceedings of the Computer Vision—ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part II. Springer: Cham, Switzerland, 2016; pp. 154–169.
- Li, B.; Peng, X.; Wang, Z.; Xu, J.; Feng, D. AOD-Net: All-in-One Dehazing Network. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
- Zhang, H.; Patel, V.M. Densely connected pyramid dehazing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3194–3203.
- Ren, W.; Ma, L.; Zhang, J.; Pan, J.; Cao, X.; Liu, W.; Yang, M.H. Gated fusion network for single image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3253–3261.
- Chen, D.; He, M.; Fan, Q.; Liao, J.; Zhang, L.; Hou, D.; Yuan, L.; Hua, G. Gated context aggregation network for image dehazing and deraining. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, 7–11 January 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1375–1383.
- Engin, D.; Genç, A.; Kemal Ekenel, H. Cycle-Dehaze: Enhanced CycleGAN for single image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 825–833.
- Singh, A.; Bhave, A.; Prasad, D.K. Single image dehazing for a variety of haze scenarios using back projected pyramid network. In Proceedings of the Computer Vision—ECCV 2020 Workshops, Glasgow, UK, 23–28 August 2020; Proceedings, Part IV. Springer: Cham, Switzerland, 2020; pp. 166–181.
- Liu, X.; Ma, Y.; Shi, Z.; Chen, J. GridDehazeNet: Attention-based multi-scale network for image dehazing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7314–7323.
- Qin, X.; Wang, Z.; Bai, Y.; Xie, X.; Jia, H. FFA-Net: Feature fusion attention network for single image dehazing. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 11908–11915.
- Mohan, S.; Simon, P. Underwater image enhancement based on histogram manipulation and multiscale fusion. Procedia Comput. Sci. 2020, 171, 941–950.
- Wang, Z.; Liu, W.; Wang, Y.; Liu, B. AGCycleGAN: Attention-guided CycleGAN for single underwater image restoration. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 2779–2783.
- Song, Y.; He, Z.; Qian, H.; Du, X. Vision transformers for single image dehazing. IEEE Trans. Image Process. 2023, 32, 1927–1941.
- Wang, Z.; Zhang, K.; Yang, Z.; Da, Z.; Huang, S.; Wang, P. Underwater Image Enhancement Based on Improved U-Net Convolutional Neural Network. In Proceedings of the 2023 IEEE 18th Conference on Industrial Electronics and Applications (ICIEA), Ningbo, China, 18–22 August 2023; pp. 1902–1908.
- Yang, J.; Li, C.; Li, X. Underwater image restoration with light-aware progressive network. In Proceedings of the ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5.
- Deng, X.; Liu, T.; He, S.; Xiao, X.; Li, P.; Gu, Y. An underwater image enhancement model for domain adaptation. Front. Mar. Sci. 2023, 10, 1138013.
- Liao, K.; Peng, X. Underwater image enhancement using multi-task fusion. PLoS ONE 2024, 19, e0299110.
- Islam, M.J.; Xia, Y.; Sattar, J. Fast Underwater Image Enhancement for Improved Visual Perception. IEEE Robot. Autom. Lett. 2020, 5, 3227–3234.
- Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An Underwater Image Enhancement Benchmark Dataset and Beyond. IEEE Trans. Image Process. 2020, 29, 4376–4389.
- Yang, H.H.; Fu, Y. Wavelet U-Net and the Chromatic Adaptation Transform for Single Image Dehazing. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019.
- Ju, M.; Ding, C.; Guo, C.A.; Ren, W.; Tao, D. IDRLP: Image dehazing using region line prior. IEEE Trans. Image Process. 2021, 30, 9043–9057.
- Fu, Z.; Lin, X.; Wang, W.; Huang, Y.; Ding, X. Underwater image enhancement via learning water type desensitized representations. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; pp. 2764–2768.
- Drews, P.; Nascimento, E.; Moraes, F.; Botelho, S.; Campos, M. Transmission estimation in underwater single images. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Sydney, Australia, 2–8 December 2013; pp. 825–830.
- Ancuti, C.O.; Ancuti, C.; De Vleeschouwer, C.; Sbetr, M. Color channel transfer for image dehazing. IEEE Signal Process. Lett. 2019, 26, 1413–1417.
- Liu, C.; Shu, X.; Pan, L.; Shi, J.; Han, B. Multi-Scale Underwater Image Enhancement in RGB and HSV Color Spaces. IEEE Trans. Instrum. Meas. 2023, 72, 5021814.
- Liang, Z.; Zhang, W.; Ruan, R.; Zhuang, P.; Xie, X.; Li, C. Underwater Image Quality Improvement via Color, Detail, and Contrast Restoration. IEEE Trans. Circuits Syst. Video Technol. 2023, 34, 1726–1742.
- Buchsbaum, G. A spatial processor model for object colour perception. J. Frankl. Inst. 1980, 310, 1–26.
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
- Ebner, M. Color Constancy; John Wiley & Sons: Hoboken, NJ, USA, 2007; Volume 7.
- Guo, C.; Li, C.; Guo, J.; Loy, C.C.; Hou, J.; Kwong, S.; Cong, R. Zero-reference deep curve estimation for low-light image enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 1780–1789.
- Hore, A.; Ziou, D. Image quality metrics: PSNR vs. SSIM. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 2366–2369.
Non-learning-based methods: NLD, RLP, MMLE, UNTV, ACT. Learning-based methods: DNet, AOD, FGAN, SCNet, Ours.

| Measure | Dataset | Type | Input | NLD | RLP | MMLE | UNTV | ACT | DNet | AOD | FGAN | SCNet | Ours |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RMSE | EUVP | Type-I | 0.11 | 0.15 | 0.15 | 0.18 | 0.14 | 0.84 | 0.12 | 0.78 | 0.73 | 0.10 | 0.08 |
| RMSE | EUVP | Type-II | 0.08 | 0.12 | 0.15 | 0.17 | 0.11 | 0.93 | 0.10 | 0.87 | 0.84 | 0.11 | 0.06 |
| RMSE | UIEBD | Type-I | 0.15 | 0.18 | 0.14 | 0.21 | 0.16 | 0.81 | 0.17 | 0.75 | 0.70 | 0.07 | 0.11 |
| RMSE | UIEBD | Type-II | 0.14 | 0.12 | 0.11 | 0.24 | 0.17 | 0.92 | 0.14 | 0.86 | 0.53 | 0.10 | 0.08 |
| PSNR | EUVP | Type-I | 20.00 | 17.00 | 17.30 | 15.30 | 17.40 | 17.50 | 19.10 | 15.10 | 13.70 | 20.00 | 22.30 |
| PSNR | EUVP | Type-II | 22.00 | 18.40 | 16.90 | 15.60 | 19.20 | 20.60 | 20.10 | 13.30 | 15.50 | 19.70 | 25.00 |
| PSNR | UIEBD | Type-I | 17.60 | 15.30 | 17.50 | 14.30 | 16.20 | 15.50 | 15.90 | 15.10 | 12.70 | 23.60 | 20.10 |
| PSNR | UIEBD | Type-II | 18.60 | 18.70 | 19.20 | 12.60 | 15.60 | 19.70 | 17.80 | 13.10 | 9.99 | 20.70 | 22.20 |
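The two metrics reported above are related: for images scaled to [0, 1], PSNR = 20·log10(MAX/RMSE) with MAX = 1. A minimal sketch of both (standard definitions, not tied to the authors' evaluation code):

```python
import numpy as np

def rmse(ref, est):
    # Root mean square error between two images in [0, 1]; lower is better.
    return float(np.sqrt(np.mean((ref - est) ** 2)))

def psnr(ref, est, max_val=1.0):
    # Peak signal-to-noise ratio in dB; higher is better.
    e = rmse(ref, est)
    return float(20.0 * np.log10(max_val / e)) if e > 0 else float("inf")

# Toy check: a constant error of 0.1 gives RMSE 0.1 and PSNR 20 dB.
ref = np.zeros((2, 2))
est = np.full((2, 2), 0.1)
print(round(rmse(ref, est), 2))  # 0.1
print(round(psnr(ref, est), 1))  # 20.0
```

This inverse relationship is why methods with low RMSE in the table tend to have high PSNR.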
Non-learning-based methods: NLD, RLP, MMLE, UNTV, ACT. Learning-based methods: DNet, AOD, FGAN, SCNet, Ours.

| Dataset | Type | Image | Input | NLD | RLP | MMLE | UNTV | ACT | DNet | AOD | FGAN | SCNet | Ours |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EUVP | Type-I | test_p84_ | 19.30 | 15.70 | 15.10 | 17.33 | 17.52 | 17.00 | 18.90 | 15.16 | 15.50 | 15.62 | 26.70 |
| EUVP | Type-I | test_p404_ | 17.20 | 15.00 | 16.00 | 18.79 | 15.96 | 14.99 | 17.10 | 15.24 | 14.60 | 13.96 | 28.30 |
| EUVP | Type-I | test_p510_ | 22.70 | 21.10 | 22.20 | 17.48 | 20.77 | 20.11 | 20.40 | 13.73 | 13.60 | 17.55 | 24.50 |
| EUVP | Type-II | test_p171_ | 23.70 | 20.70 | 19.60 | 15.61 | 20.38 | 22.04 | 19.90 | 12.13 | 15.20 | 20.22 | 27.70 |
| EUVP | Type-II | test_p255_ | 26.50 | 18.90 | 8.45 | 10.45 | 18.61 | 24.88 | 26.40 | 15.73 | 13.50 | 24.27 | 29.90 |
| EUVP | Type-II | test_p327_ | 23.70 | 20.20 | 16.20 | 15.68 | 20.24 | 21.18 | 20.50 | 13.44 | 15.00 | 18.99 | 26.10 |
| UIEBD | Type-I | 375_img_ | 20.80 | 19.50 | 16.80 | 16.54 | 20.10 | 19.40 | 19.30 | 15.01 | 15.40 | 15.57 | 22.90 |
| UIEBD | Type-I | 495_img_ | 18.10 | 15.40 | 14.50 | 15.11 | 17.51 | 14.62 | 17.20 | 13.94 | 14.00 | 13.80 | 27.00 |
| UIEBD | Type-I | 619_img_ | 22.70 | 21.10 | 22.20 | 17.49 | 20.78 | 20.12 | 20.41 | 13.74 | 13.61 | 17.56 | 24.51 |
| UIEBD | Type-II | 746_img_ | 23.70 | 20.70 | 19.60 | 15.62 | 20.39 | 22.05 | 19.91 | 12.14 | 15.21 | 20.23 | 27.71 |
| UIEBD | Type-II | 845_img_ | 26.50 | 18.90 | 8.46 | 10.46 | 18.62 | 24.89 | 26.41 | 15.74 | 13.51 | 24.28 | 29.91 |
| UIEBD | Type-II | 967_img_ | 23.70 | 20.20 | 16.21 | 15.69 | 20.25 | 21.19 | 20.51 | 13.45 | 15.01 | 19.00 | 26.11 |
Non-learning-based methods: NLD, RLP, MMLE, UNTV, ACT. Learning-based methods: DNet, AOD, FGAN, SCNet, Ours.

| Dataset | Type | Image | Input | NLD | RLP | MMLE | UNTV | ACT | DNet | AOD | FGAN | SCNet | Ours |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EUVP | Type-I | test_p84_ | 0.11 | 0.16 | 0.20 | 0.14 | 0.13 | 0.14 | 0.11 | 0.18 | 0.17 | 0.17 | 0.05 |
| EUVP | Type-I | test_p404_ | 0.14 | 0.18 | 0.20 | 0.11 | 0.16 | 0.18 | 0.14 | 0.17 | 0.19 | 0.20 | 0.04 |
| EUVP | Type-I | test_p510_ | 0.07 | 0.09 | 0.10 | 0.13 | 0.09 | 0.10 | 0.10 | 0.21 | 0.21 | 0.13 | 0.06 |
| EUVP | Type-II | test_p171_ | 0.07 | 0.09 | 0.10 | 0.17 | 0.10 | 0.08 | 0.10 | 0.25 | 0.17 | 0.10 | 0.04 |
| EUVP | Type-II | test_p255_ | 0.05 | 0.11 | 0.40 | 0.30 | 0.12 | 0.06 | 0.05 | 0.16 | 0.21 | 0.06 | 0.03 |
| EUVP | Type-II | test_p327_ | 0.07 | 0.10 | 0.20 | 0.16 | 0.10 | 0.09 | 0.09 | 0.21 | 0.18 | 0.11 | 0.05 |
| UIEBD | Type-I | 375_img_ | 0.09 | 0.11 | 0.14 | 0.15 | 0.10 | 0.11 | 0.11 | 0.18 | 0.17 | 0.17 | 0.07 |
| UIEBD | Type-I | 495_img_ | 0.14 | 0.20 | 0.20 | 0.18 | 0.14 | 0.19 | 0.14 | 0.20 | 0.20 | 0.20 | 0.04 |
| UIEBD | Type-I | 619_img_ | 0.07 | 0.09 | 0.10 | 0.13 | 0.09 | 0.10 | 0.10 | 0.21 | 0.21 | 0.13 | 0.06 |
| UIEBD | Type-II | 746_img_ | 0.07 | 0.09 | 0.10 | 0.17 | 0.10 | 0.08 | 0.10 | 0.25 | 0.17 | 0.10 | 0.04 |
| UIEBD | Type-II | 845_img_ | 0.05 | 0.11 | 0.40 | 0.30 | 0.12 | 0.06 | 0.05 | 0.16 | 0.21 | 0.06 | 0.03 |
| UIEBD | Type-II | 967_img_ | 0.07 | 0.10 | 0.20 | 0.16 | 0.10 | 0.09 | 0.09 | 0.21 | 0.18 | 0.11 | 0.05 |
| Dataset | Group | Input | DLM | DCM |
|---|---|---|---|---|
| EUVP | Type-I | 0.11/20.00 | 0.93/22.35 | 0.89/20.63 |
| EUVP | Type-II | 0.08/22.00 | 0.97/24.18 | 0.97/25.03 |
| UIEBD | Type-I | 0.15/17.60 | 0.93/20.08 | 0.85/16.37 |
| UIEBD | Type-II | 0.14/18.60 | 0.94/17.81 | 0.95/22.20 |
| Number of Images | Classifier | DLM | DCM |
|---|---|---|---|
| 2700 | 37.82 | 4.11 | 5.84 |
| 1 | 0.0140 | 0.0015 | 0.0021 |
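As a sanity check on the timing table (units presumed seconds), the single-image row should equal the 2700-image totals divided by 2700; the small sketch below confirms this holds up to rounding (the DCM entry rounds to 0.0022 rather than the listed 0.0021).

```python
# Totals for 2700 images, taken from the table above.
totals = {"Classifier": 37.82, "DLM": 4.11, "DCM": 5.84}
per_image = {name: t / 2700 for name, t in totals.items()}
for name, t in per_image.items():
    print(f"{name}: {t:.4f}")
# Classifier: 0.0140, DLM: 0.0015, DCM: 0.0022
```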
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Awan, H.S.A.; Mahmood, M.T. Deep Dynamic Weights for Underwater Image Restoration. J. Mar. Sci. Eng. 2024, 12, 1208. https://doi.org/10.3390/jmse12071208