1. Introduction
With the development of underwater imaging technology, underwater target recognition has been widely used in topographic surveying and geomorphological observation [1]. In naturally still water, the scattering and absorption by suspended particles are the main factors degrading an underwater image [2], limiting the underwater visible range to only a few tens of meters. Earlier studies focused on the degradation model, solving beam transmission and scattering problems to improve image quality and imaging distance [3,4]. However, in real water environments such as rivers and oceans, the underwater visible distance decreases severely due to turbulence; the nonuniform variation of the light field distribution results in image distortion [5,6], which makes turbulence the most important degradation factor in natural-water imaging. Therefore, it is necessary to study underwater turbulent degradation in depth.
Several groups have studied degradation by underwater turbulence and the corresponding image recovery processing. Hou et al. [7,8,9,10,11] used an underwater imaging degradation model to analyze the effects of suspended particles, turbulence, and path scattering on underwater optical imaging. Gero et al. [12,13,14] established laboratory and field underwater turbulence experimental systems, conducted on-site measurements, and quantitatively analyzed the influence of optical turbulence on the resolution of underwater imaging systems. Matt et al. [15,16] built a controllable and reproducible turbulent-environment experimental platform; a Doppler velocimeter and a particle image velocimetry (PIV) system were used to analyze the flow field, and a computational fluid dynamics model was used to compensate the measurement results. Farwell et al. [17,18] studied the intensity and coherence distribution of underwater beam propagation through turbulence based on the ocean turbulence power spectrum model and performed extensive numerical calculations.
There are thus three main lines of research: theoretical calculation from the turbulent structure function and scattering characteristics; the establishment of experimental systems simulating turbulence for laboratory and field measurement and analysis; and simulation experiments using the PIV method. Chen et al. have studied these three approaches [19,20], and the results show that the turbulent flow field causes a decline of the modulation transfer function (MTF) over the whole spatial frequency range, while path radiance and the fluid medium reduce the modulation contrast at high spatial frequencies. In fact, turbulence affects imaging at both high and low frequencies because of the nonuniformity of the light field, which causes image distortion. Therefore, it is necessary to study image processing methods specifically aimed at image distortion.
Hu et al. [21] proposed a method based on motion-field kernel regression. Holohan et al. [22] proposed the use of adaptive optics (AO) technology for image processing. Wen et al. [23] proposed an underwater image reconstruction method based on motion compensation for selecting and denoising high-quality image blocks. Kumar et al. [24] proposed a two-stage image reconstruction method: in the first stage, a blind image quality (BIQ) metric and the K-means clustering algorithm are used to select the reference frame and a sequence of clear frames, respectively; in the second stage, pixel registration and two-dimensional interpolation are used to reconstruct the distorted image. Although this method can effectively alleviate the impact of turbulence, its computational complexity is high.
In terms of image distortion elimination, Ahn et al. [25] introduced a convolutional neural network for image distortion classification. Mao [26] proposed a 2D interpolation-based distortion correction technique for bistatic synthetic aperture radar (SAR) polar-format image formation. Sun et al. [27] proposed an improved cubic chirplet decomposition method based on linked scatterers to solve the distortion problem for shipborne bistatic ISAR. There are further studies on the elimination of image distortion in other fields [28,29]. From the above, it can be seen that image distortion is generally eliminated by inverting the operations that caused it, with popular approaches including the use of neural networks to identify the type of distortion.
Therefore, in this paper, a self-defined metric is used to select the reference frame and an input sequence of short-exposure frames with high clarity. Pixel registration and two-dimensional interpolation algorithms are then used to suppress the distortion, and a kernel correlation filtering algorithm is used to improve the speed and efficiency of the method, reducing the amount of computation while improving the distortion removal effect.
2. Theory and Methods
2.1. Two-Dimensional Pixel Registration Algorithm
The reference frame and the input frame sequence are selected according to the sharpness values of the captured image frames. The sharpness of an image can be calculated by [24]:
where $\bar{E}$ is the mean value of $E(\theta)$, $N_{\theta}$ represents the number of directions selected, and $E(\theta)$ denotes the expected entropy of the image:
where $M \times N$ represents the size of the image, $\theta$ is the measurement direction, $n$ and $k$ represent the discrete variables of time and frequency, respectively, $P$ is the number of pixels, and $(\cdot)^{*}$ represents the complex conjugate.
We define $D(\rho)$ as the wave structure function of turbulence [30]:
where $k = 2\pi/\lambda$ is the wavenumber, $\lambda$ is the wavelength (530 nm in the calculations of this paper), and $\rho$ is the distance between two points on the cross section perpendicular to the transmission direction.
The reference frame can be selected as the input frame with the highest sharpness value, and frames with higher sharpness are kept as the input frame sequence for subsequent image processing.
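To make this selection step concrete, the following sketch scores each captured frame with a simple entropy-of-gradients sharpness proxy and keeps the sharpest frames. It is an illustrative simplification rather than the exact directional-entropy metric of Equations (1) and (2), and all function names are ours.

```python
import numpy as np

def sharpness_score(frame, bins=256):
    """Simple entropy-based sharpness proxy: the entropy of the gradient-magnitude
    histogram tends to be higher for sharper, better-structured frames."""
    gx, gy = np.gradient(frame.astype(np.float64))
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(mag, bins=bins, density=True)
    p = hist[hist > 0]
    p = p / p.sum()
    return -np.sum(p * np.log2(p))

def select_frames(frames, keep=10):
    """Pick the sharpest frame as the reference and keep the `keep` sharpest
    frames as the input sequence for subsequent registration."""
    scores = np.array([sharpness_score(f) for f in frames])
    order = np.argsort(scores)[::-1]
    reference = frames[order[0]]
    kept = [frames[i] for i in order[:keep]]
    return reference, kept
```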
The pixel shift of the input frame sequence relative to the reference frame is calculated using the backward mapping method:
where $\bar{u}$ and $\bar{v}$ represent the mean values of the pixel shift in the horizontal and vertical directions, respectively, $j$ denotes the frame index, and $K$ denotes the total number of reserved frames.
The corrected shift of each pixel in all reserved input frames is derived from:
where $\Delta u$ and $\Delta v$ are the corrected displacements in the horizontal and vertical directions, and $u^{-1}$ and $v^{-1}$ represent the inverse mappings of the horizontal and vertical shift fields $u$ and $v$, respectively.
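A minimal sketch of this correction step is given below, assuming that a dense shift field has already been estimated for every reserved frame: the mean shift is removed and each frame is warped back toward the reference by backward mapping. The function and variable names are illustrative, and SciPy's map_coordinates is used only for convenience; it is not part of the method described above.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def correct_frames(frames, u_fields, v_fields):
    """Remove the mean (global) shift from each frame's dense shift field and
    warp the frame back toward the reference by backward mapping."""
    u_mean = np.mean(u_fields, axis=0)          # mean horizontal shift field
    v_mean = np.mean(v_fields, axis=0)          # mean vertical shift field
    h, w = frames[0].shape
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float64)
    corrected = []
    for frame, u, v in zip(frames, u_fields, v_fields):
        du, dv = u - u_mean, v - v_mean         # corrected per-pixel displacement
        # Backward mapping: sample the distorted frame at the displaced positions.
        coords = np.array([yy + dv, xx + du])
        corrected.append(map_coordinates(frame, coords, order=1, mode='nearest'))
    return corrected
```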
Then the corrected frames can be restored and reconstructed by:
where $I_{R}$ represents the restored image, $I_{SR}$ represents the reconstructed image, $\{I_{j}\}$ represents the sequence of reserved frames, $\varphi$ represents the angle of rotation, and $G$ denotes the Gaussian estimation.
The recovered image can be used as the reference image for the next iteration; through multiple iterations, the distortion is progressively removed.
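In outline, the iteration might be organized as follows; the registration and correction routines are supplied by the caller (for example, the sketches above), and the fusion step is reduced to a plain temporal average for illustration, whereas the method described here applies the restoration and reconstruction of Equations (7) and (8) at this point.

```python
import numpy as np

def iterative_dedistortion(frames, reference, estimate_shifts, correct, n_iter=3):
    """Iterative de-distortion loop: register the reserved frames to the current
    reference, correct the per-pixel shifts, fuse the result, and repeat with the
    fused image as the new reference.  `estimate_shifts` and `correct` are
    caller-supplied routines; the fusion below is a simple average."""
    for _ in range(n_iter):
        u_fields, v_fields = estimate_shifts(frames, reference)
        corrected = correct(frames, u_fields, v_fields)
        reference = np.mean(corrected, axis=0)   # simplified fusion step
    return reference
```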
2.2. Support Vector Machine-Based Kernel Correlation Filtering Algorithm
When finding the optimal solution of Equations (7) and (8), a regularization constraint can be used to limit the iteration process; its main idea is to solve the mathematically ill-conditioned problem of finding the minimum value. The constraint algorithm in this paper combines the idea of the kernel correlation filter (KCF) algorithm.
The regularization is expressed as follows [31]:
where $f$ and $g$ represent the original image and the observed image, respectively, $\min(\cdot)$ represents the minimum value, $\alpha$ represents the regularization factor, and $C$ represents the penalty factor.
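For orientation, a standard Tikhonov-type regularized restoration objective of this kind, written here in generic notation that need not coincide with the symbols of Equation (9), is:

\[
\hat{f} \;=\; \arg\min_{f}\ \lVert g - Hf \rVert_{2}^{2} \;+\; \alpha\,\lVert C f \rVert_{2}^{2},
\]

where $H$ is the degradation (blur) operator and $C$ is a high-pass penalty operator; the first term enforces fidelity to the observation, and the second suppresses the ill-conditioned components of the solution.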
As a result, in the process of solving Equations (7) and (8), the goal of Equation (9) is to find a best approximation solution, which can be formulated in terms of the interval (margin) of functions.
The distance between a point in the sample space and the classification hyperplane is calculated as follows:
The distance between the support vectors and the hyperplane is called the "interval" (margin) of the support vector machine (SVM), which can be expressed as follows:
The principle of a support vector machine is to maximize the interval, i.e., to minimize $\lVert \mathbf{w} \rVert$. The constrained optimization problem of linear classification can be expressed as follows:
A Lagrangian function can be constructed by introducing Lagrange multipliers into the constraints:
Then the extremum condition can be obtained by setting the partial derivatives to zero:
The dual problem can be obtained by substituting the above two conditions into the formula:
In turn, the weight vector $\mathbf{w}$ and bias $b$ can be solved as follows:
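For completeness, the standard hard-margin linear SVM derivation that these steps follow can be written, in conventional notation that may differ from the original symbols, as:

\[
\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\lVert \mathbf{w}\rVert^{2}
\quad \text{s.t.}\quad y_{i}\bigl(\mathbf{w}^{\top}\mathbf{x}_{i}+b\bigr) \ge 1,\qquad i=1,\dots,m,
\]

\[
L(\mathbf{w},b,\boldsymbol{\alpha}) \;=\; \tfrac{1}{2}\lVert\mathbf{w}\rVert^{2}
 \;-\; \sum_{i=1}^{m}\alpha_{i}\bigl[y_{i}(\mathbf{w}^{\top}\mathbf{x}_{i}+b)-1\bigr],
\]

and setting $\partial L/\partial \mathbf{w}=0$ and $\partial L/\partial b=0$ gives $\mathbf{w}=\sum_{i}\alpha_{i}y_{i}\mathbf{x}_{i}$ and $\sum_{i}\alpha_{i}y_{i}=0$, which leads to the dual problem

\[
\max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{m}\alpha_{i}
 - \tfrac{1}{2}\sum_{i,j}\alpha_{i}\alpha_{j}\,y_{i}y_{j}\,\mathbf{x}_{i}^{\top}\mathbf{x}_{j}
\quad \text{s.t.}\quad \alpha_{i}\ge 0,\ \ \sum_{i}\alpha_{i}y_{i}=0 .
\]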
In the design of the algorithm, in order to use the blurred sample image to train the least-squares classifier and to simplify the computation, a circulant matrix can be constructed. We set:
where $\lambda$ is the parameter that controls overfitting.
Assuming that $X$ is an $n \times n$ matrix, it can be obtained by cyclic shifting of a vector $\mathbf{x}$, from which the solution $\mathbf{w}$ can be obtained:
The above matrix can be diagonalized by the discrete Fourier transform, and the constructed kernel function is:
Since the matrix is diagonal, Equation (11) can be converted to:
Then the discrete Fourier form of the solution can be obtained by substituting Equation (20) into Equation (16), which greatly reduces the amount of computation in the training process of the least-squares classifier.
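To illustrate the computational saving, the sketch below trains and evaluates a kernelized correlation filter entirely in the Fourier domain, exploiting the diagonalization of circulant matrices. The Gaussian kernel and all names are illustrative choices and do not necessarily match the exact formulation used in this paper.

```python
import numpy as np

def gaussian_correlation(x, z, sigma=0.5):
    """Gaussian (radial basis function) kernel correlation of two equally sized
    patches, computed efficiently via the FFT over all cyclic shifts."""
    xf, zf = np.fft.fft2(x), np.fft.fft2(z)
    cross = np.real(np.fft.ifft2(xf * np.conj(zf)))        # cross-correlation
    d2 = np.sum(x ** 2) + np.sum(z ** 2) - 2.0 * cross     # squared distances
    return np.exp(-np.maximum(d2, 0) / (sigma ** 2 * x.size))

def train_kcf(x, y, lam=1e-4, sigma=0.5):
    """Closed-form Fourier-domain solution of kernel ridge regression:
    alpha_hat = y_hat / (k_hat + lambda)."""
    k = gaussian_correlation(x, x, sigma)
    return np.fft.fft2(y) / (np.fft.fft2(k) + lam)

def detect_kcf(alpha_hat, x_train, z, sigma=0.5):
    """Response map for a new patch z; the peak location gives the shift estimate."""
    k = gaussian_correlation(z, x_train, sigma)
    return np.real(np.fft.ifft2(alpha_hat * np.fft.fft2(k)))
```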
In summary, the kernel correlation matrix constructed in Equation (18) is substituted into Equation (16) to speed up the constrained optimization of classification in the support vector machine when calculating the best approximation solution of Equation (9). The speed of the algorithm is therefore determined by the constructed kernel matrix. Possible kernel functions include the radial basis function kernel, the dot product kernel, the weighted kernel, and so on; in this paper, the radial basis function kernel is selected. The accuracy of the algorithm, however, still depends on the solution of Equations (7) and (8), where B in Equation (1) determines the input of the algorithm, and the kernels of the restoration and reconstruction algorithms also affect the accuracy. Thus, when the turbulence velocity or visibility changes, the initial estimation function in Equations (7) and (8) is changed to improve the algorithm.
3. Experimental Results and Analysis
In order to further verify the effectiveness of the proposed method, the experimental data for this paper were obtained through the laboratory simulation of a turbulent environment and field tests in real turbulent water environments.
Because image distortion is related to the turbulent velocity field, an image restoration method based on PIV velocity field measurement is used as a verification method in the laboratory experiment system. Considering the flow-following ability and light scattering characteristics of tracer particles, common bubbles (which also have the advantage of being nonpolluting) are selected as tracer particles to measure the flow velocity field distribution of underwater turbulence. The probability density function of the bubble motion displacement can be described by the probability density function of time:
where $p(t)$ is the probability density function of time, with time treated as a random variable subject to a uniform distribution; $1/d$ is the reciprocal of the relative displacement $d$; $v$ is the speed of bubble motion, which can be estimated from the bubble dynamics equation; and $t_{e}$ is the exposure time of the image sensor. Then the motion modulation transfer function of the bubble can be calculated by the one-dimensional Fourier transform of the probability density function of the relative displacement:
Thus, the MTF can be used as a priori knowledge of image restoration reconstruction algorithm.
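As a simple illustration of this prior, the sketch below computes a motion MTF as the magnitude of the one-dimensional Fourier transform of a displacement probability density. The uniform-displacement example corresponds to a bubble moving at constant speed during the exposure, and the numerical values are arbitrary assumptions rather than measured quantities.

```python
import numpy as np

def motion_mtf(displacement_pdf, dx):
    """Motion MTF as the normalized magnitude of the 1D Fourier transform of the
    probability density function of relative displacement."""
    pdf = displacement_pdf / (np.sum(displacement_pdf) * dx)   # normalize to unit area
    otf = np.fft.fft(pdf) * dx                                  # 1D optical transfer function
    mtf = np.abs(otf)
    freqs = np.fft.fftfreq(pdf.size, d=dx)                      # spatial frequencies
    return freqs, mtf / mtf.max()

# Example: a bubble moving at constant speed v during exposure t_e gives a
# uniform displacement PDF over [0, v * t_e]; its MTF is a sinc-like curve.
v, t_e, dx = 0.02, 5e-3, 1e-6          # assumed values, for illustration only
x = np.arange(0.0, 5 * v * t_e, dx)
pdf = np.where(x <= v * t_e, 1.0 / (v * t_e), 0.0)
freqs, mtf = motion_mtf(pdf, dx)
```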
In order to analyze the processing results objectively, this paper adopts no-reference objective evaluation criteria for assessing the quality of image restoration and reconstruction, namely the information capacity (IC), blur metric (BM), and gray mean gradient (GMG). The IC characterizes the richness of useful image information, the BM describes the degree of image blur and distortion, and the GMG reflects the image edge information. Larger IC and GMG values together with a smaller BM value denote better restoration and reconstruction. These evaluation criteria have been described in detail in previous articles published by the research team [19,20], and so are not repeated here.
The BM is defined as follows:
where the two image terms represent the images blurred in the vertical and horizontal directions, respectively, $F(x, y)$ is the pixel at coordinate $(x, y)$ on the image plane, and $M \times N$ is the size of the image. The blur metric is then normalized to the range 0 to 1.
The IC is defined as follows:
where $R$ represents the correlation between pixels, $x$ and $y$ represent the coordinates of the pixel, $d$ is the imaging distance, and $\theta$ represents the direction of association between the pixels.
The GMG is defined as follows:
where $F(x, y)$ denotes the pixel at coordinate $(x, y)$ on the image plane, and $M \times N$ is the size of the image.
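As an example of how such no-reference scores are computed, the following sketch evaluates a gray mean gradient in the spirit of the GMG above; the exact normalization used in the cited definition may differ.

```python
import numpy as np

def gray_mean_gradient(img):
    """Gray mean gradient (GMG): average magnitude of the local gray-level
    gradient, a simple no-reference sharpness/edge-content score."""
    f = img.astype(np.float64)
    gx = f[:, 1:] - f[:, :-1]          # horizontal differences
    gy = f[1:, :] - f[:-1, :]          # vertical differences
    # Combine on the common (M-1) x (N-1) grid.
    g = np.sqrt((gx[:-1, :] ** 2 + gy[:, :-1] ** 2) / 2.0)
    return g.mean()
```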
3.1. Laboratory Experiments
An underwater turbulence experiment system is established in this paper. A 532 nm green semiconductor laser is used as the light source, and images are captured by a high-speed CMOS image sensor. The spot size of the laser is 10–20 mm, and its power is 200 mW. The experimental water tank, with a size of 150 cm × 34 cm × 33 cm (length × height × width), is made of high-transmittance acrylic plate, so that more than 90% of the laser light reaches the target plate. Both the inlet and the outlet are 40 mm round holes placed at different heights so that turbulence is formed with a water pump. The system uses a circulating pump with a maximum head of 5 m and a maximum flow of 7.8 m³/h to provide hydrodynamic power. The laser and sensor are 33 cm away from the target plate. In order to reduce experimental error, the experiment was carried out in a dark environment. The three-dimensional structure of the experimental system is shown in Figure 1.
The Reynolds number (Re) is used to determine whether the fluid is in a turbulent state; if Re > 4000, the flow is turbulent. The flow rate of the water is controlled by the flow meter and the valve of the water pump: the pump drives the flow, and the valve controls its magnitude. Turbulence occurs when the inlet flow reaches a certain speed, so turbulence of different strengths is obtained by controlling the water velocity at the inlet of the tank. The flow meter reads the velocity in real time, from which the turbulent Reynolds number and turbulence intensity are calculated to ensure that the sample images are obtained in a turbulent environment.
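As a quick check of this criterion, the inlet Reynolds number can be estimated from the inlet velocity, the hole diameter, and the kinematic viscosity of water; the values below are illustrative assumptions rather than the measured experimental conditions.

```python
def reynolds_number(velocity_m_s, diameter_m, kinematic_viscosity_m2_s=1.0e-6):
    """Re = v * D / nu for a circular inlet; Re > 4000 indicates turbulent flow."""
    return velocity_m_s * diameter_m / kinematic_viscosity_m2_s

# Illustrative values: a 40 mm inlet and water at ~20 degrees C (nu ~ 1e-6 m^2/s).
re = reynolds_number(velocity_m_s=5.0, diameter_m=0.040)
print(f"Re = {re:.0f}, turbulent: {re > 4000}")
```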
The training platform of this algorithm is as follows: the operating system is Ubuntu 14.04 (Canonical Ltd., London, UK), the CPU is a Core i7-9700K at 4.9 GHz (Intel, Santa Clara, CA, USA), and the graphics card is an ASUS DUAL RTX2070-O8G-EVO (ASUS, Taipei, Taiwan). The programming is performed in MATLAB R2017b (MathWorks, Natick, MA, USA). If the computer configuration is downgraded or upgraded, the processing time increases or decreases accordingly. If the image resolution increases, the number of training window traversals in the SVM increases, and the processing time increases accordingly. The sample images selected in this paper are cropped to a resolution of 800 × 600, and the scale factor of super-resolution reconstruction is set to 3.
3.1.1. Microturbulent Environment
When the water velocity at the inlet is 5 m/s, the target object is photographed 60 times by a charge-coupled device (CCD) sensor within 5 s. The captured image sequences are processed by the proposed algorithm and compared with traditional blind restoration (BD) [32], projection onto convex sets reconstruction (POCS) [33], the semi-blind restoration and reconstruction method based on the turbulent degradation model (M−SB) [19], the total variation image super-resolution reconstruction technique based on the L1 norm (M−TV) [34], and a restoration and reconstruction method based on the PIV method (PIV−RR) [35]. The sample image is shown in Figure 2, with the restored and reconstructed results shown in Figure 3. The evaluation values of the images are listed and compared in Table 1, and Table 2 shows the processing times of the algorithms.
It can be seen that the traditional BD method relieves a certain degree of blurring but introduces a strong ringing effect; this is improved by the M−SB method, although the distortion of the image is not. The POCS and M−TV methods improve the image resolution and sharpness, but again do not improve the distortion. In other words, these restoration and reconstruction algorithms are effective for improving resolution and deblurring, but they are not suited to image distortion, which explains the necessity of the algorithm proposed in this paper. With the proposed method the distortion is significantly improved, and the PIV−RR method also has a certain effect on image distortion.
As can be seen from Table 1, the BM values of the M−SB and PIV−RR methods are smaller than those of the other methods; the BM value of the proposed method is larger than these two, but not by much. The IC value of the M−TV method is the largest, followed by that of the proposed method, while the PIV−RR method gives a small value. The GMG values of both PIV−RR and the proposed method are larger than those of the other methods. It can be concluded that the proposed algorithm is not as good at deblurring as dedicated image restoration, but it has advantages over the PIV−RR method in reconstruction.
As can be seen from Table 2, the method proposed in this paper has obvious advantages in terms of processing time.
3.1.2. Strong Turbulence Environment
When the water velocity at the inlet reaches 25 m/s, as can be seen in Figure 4, the degree of image distortion increases greatly. The results after image restoration and reconstruction are shown in Figure 5. The evaluation values of the images are listed and compared in Table 3, and the processing times are compared in Table 4.
It can be seen from Figure 5 that the traditional BD method shows no obvious ringing effect, but the image remains more blurred, as is also the case with the M−TV method. The distortion of the image is likewise not improved by the POCS and M−SB methods. In the case of strong turbulence, the method proposed in this paper clearly performs better than the PIV−RR method.
As can be seen from Table 3, the BM values of the M−SB and PIV−RR methods are smaller, while the proposed method gives a larger value. The IC value of the proposed method is the largest, while the PIV−RR method gives a small value. The GMG values of both PIV−RR and the proposed method are larger than those of the other methods. It can be seen from Table 4 that the proposed method again has an obvious advantage in terms of processing time.
It can therefore be concluded that, subjectively, the proposed method performs better than the other methods in terms of image distortion. Objectively, the proposed method performs relatively poorly at deblurring, but is stronger in improving image resolution and sharpness than the reconstruction methods. Compared with the PIV−RR method in particular, the two methods are comparable under microturbulence, but under strong turbulence the proposed method is subjectively clearly better. In terms of processing speed, the proposed method has an obvious advantage.
3.2. Field Tests
Tests in a turbulent water environment were carried out in the Yangtze River and the South China Sea. Sample images were captured by a packaged underwater imaging system: a laser operating at 465–470 nm and a CMOS image sensor are enclosed in a waterproof housing, and the images captured by the sensor are transferred to an image processing module. The attenuation coefficient of the water is assumed to be a constant that does not change with wavelength over the observation range, and can be measured by:
where $z$ is the depth, $E(z)$ is the irradiance at depth $z$, and $E(0)$ is the irradiance at the surface.
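In conventional notation, which may differ from the symbols of the original equation, this measurement follows the Beer-Lambert law:

\[
E(z) \;=\; E(0)\,e^{-c z}
\quad\Longrightarrow\quad
c \;=\; \frac{1}{z}\,\ln\frac{E(0)}{E(z)} .
\]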
Figure 6 shows the schematic diagram of the experimental system, with its physical properties listed in Table 5.
The sample images and processed results are shown in Figure 7 and Figure 8. The evaluation values of the images are listed and compared in Table 6 and Table 7, and the processing times of the algorithms are compared in Table 8 and Table 9. Following the laboratory experiments, the traditional BD and POCS methods are no longer used as comparisons in the field experiments.
The experimental results in the river are similar to those under strong turbulence, while those in the ocean are closer to the microturbulence case, so the findings of the laboratory experiments and the robustness of the proposed algorithm are verified.
In order to further verify the validity of the algorithm, two sets of the TURBID dataset [36] with a turbidity level of 10 are used for image enhancement and comparison with other recent underwater image enhancement methods. In recent years, research on underwater image enhancement has mainly focused on mathematical methods such as estimation [37,38,39,40], fusion [41], and color correction [42,43,44], as well as their combination with deep neural networks [45,46,47]. In this paper, Accurate Image Super-Resolution Using Very Deep Convolutional Networks (VDSR) [48] is chosen for comparison as the deep neural network method. Zhang et al. [49] proposed a medium transmission estimation method for underwater images based on a joint prior distribution, which is also added for comparison. A future research direction is to introduce neural networks and adopt newer datasets, such as the Underwater Image Enhancement Benchmark Dataset (UIEBD) [50,51].
The processing results are shown in Figure 9 and the evaluation results are given in Table 10.
It can be seen from the experimental results that the VDSR and PIV methods are inferior, while Zhang's method and the proposed method show good results, especially on the Chlorophyll dataset. According to the paper that proposed the datasets, the turbidity generation mainly affects scattering; therefore, Zhang's method, which addresses light scattering, and the proposed method, which considers scattering-induced displacement, achieve better recovery effects. The VDSR method, with its uncertain training process, and the PIV method, which depends on measured parameters, cannot achieve good results. It should be noted that VDSR is faster than the proposed method, but this is the inference time after training; the training time is not included. The method proposed in this paper can be used for both de-distortion and de-blurring, so it is more widely applicable.