A Fusion Underwater Salient Object Detection Based on Multi-Scale Saliency and Spatial Optimization

Huang, Weiliang; Zhu, Daqi; Chen, Mingzhi

doi:10.3390/jmse11091757

Open AccessArticle

A Fusion Underwater Salient Object Detection Based on Multi-Scale Saliency and Spatial Optimization

by

Weiliang Huang

¹

,

Daqi Zhu

^2,*

and

Mingzhi Chen

²

¹

Laboratory of Underwater Vehicles and Intelligent Systems, Shanghai Maritime University, Haigang Avenue 1550, Shanghai 201306, China

²

The School of Mechanical Engineering, University of Shanghai for Science and Technology, Jungong Road 516, Shanghai 200093, China

^*

Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2023, 11(9), 1757; https://doi.org/10.3390/jmse11091757

Submission received: 25 July 2023 / Revised: 25 August 2023 / Accepted: 3 September 2023 / Published: 8 September 2023

(This article belongs to the Section Ocean Engineering)

Download

Browse Figures

Versions Notes

Abstract

:

Underwater images contain abundant information, but many challenges remain for underwater object detection tasks. Various salient object detection methods may encounter low detection precision, and the segmented map has an incomplete region of the target object. To deal with blurry underwater scenes and vague detection problems, a novel fusion underwater salient object detection algorithm (FUSOD) is proposed based on multi-scale saliency and spatial optimization. Firstly, an improved underwater color restoration was utilized to restore the color information for afterward color contrast saliency calculation. Secondly, a more accurate multi-scale fusion saliency map was obtained by fully considering both the global and local feature contrast information. Finally, the fusion saliency was optimized by the proposed spatial optimization method to enhance the spatial coherence. The proposed FUSOD algorithm may process turbid and complex underwater scenes and preserve a complete structure of the target object. Experimental results on the USOD dataset show that the proposed FUSOD algorithm can segment the salient object with a comparatively higher detection precision than the other traditional state-of-the-art algorithms. An ablation experiment showed that the proposed spatial optimization method increases the detection precision by 0.0325 scores in the F-Measure.

Keywords:

multi-scale; spatial optimization; image processing; fusion underwater salient object detection

1. Introduction

Underwater images contain copious information and have been an imperative source of completely different, intriguing branches of technology and scientific research [1], such as underwater image enhancement [2], underwater color restoration [3], and underwater object detection [4]. However, the appropriateness of the nonexclusive classic models in real-time underwater robotic vision has been constrained. Underwater imagery suffers from poor visibility resulting from the attenuation of the propagated light, mainly due to absorption and scattering effects [5]. The turbid underwater environment makes object detection a challenging job. For decades, several attempts have been made to reestablish and enhance the visibility of such corrupted images.

As the deterioration of underwater scenes corresponds to the combination of multiplicative and added substance processes, conventional enhancing strategies such as gamma adjustment [6], histogram equalization [7], and color correction [8]. The existing underwater restoration strategies have sprung up in recent years. Several foggy dehazing techniques [9] have been introduced with great attention. However, underwater imagery is even more difficult to handle the extinction resulting from scattering depending on the light wavelength on the color component. The absorption substantially reduces the color information in underwater scenes which has the results of foggy object appearance contrast degradation.

On the other hand, detecting the most valuable object in the underwater scene has become an important and valuable subject of research. The human visual system has an effective attention mechanism to find the most important information from the visual scene. Computer vision aims to imitate the mechanism in two research branches: eye fixation detection and salient object detection. Our work focuses on the second one, and aims at detecting the most salient object in the underwater domain-specific field which may be composited with several different characteristic regions.

Eye prediction models typically try to detect a small set of points that represent human eye fixation, while salient object detection focuses on segmenting the whole extent of the salient object. Complete and accurate salient object detection has been not only a challenging job in underwater applications but also important for visual detection, scene comprehension, visual recognition, and other computer vision tasks [10,11]. The research on saliency measures may be classified into two groups according to the two human attention mechanism branches. The first saliency computation model is defined by Itti [12] which first brings the saliency biological model into the computer vision field. The model extracts color, intensity, and orientation features to calculate the saliency contrast information with visual biological mechanisms. After the biological model was proposed, a lot of computational biological models sprang up which may be referred to as the bottom-up models. Since bottom-up saliency methods largely depend on biological mechanisms and only calculate the local feature contrast information, they can only detect the edge and contour information without the complete whole body of the object. Liu [13] was the first algorithm to deal with the bottom-up problem and bring salient object detection into computer vision which may focus on detecting the whole region of the object. Liu computes the multi-scale color contrast on the original image and fuses the local feature with the global feature through CRF. Finally, the salient object is obtained by searching for the object that meets the threshold of the region. Since the result mostly depends on the handcraft threshold, detection performance varies from different scenes.

Traditional salient object detection often utilizes bottom-up saliency measures to make the coarse detection. They search the saliency map by a greedy algorithm or use the grab-cut segmentation algorithm to find the salient object while the region meets the handcraft threshold of the object. To avoid the high exhibition of time caused by the greedy search used before, Lampert [14] uses the ESS algorithm instead. Achanta [15] segments the original image into pieces by mean-shift algorithm and then uses an adaptive threshold method to find the region that has the double saliency value compared with the average value of the whole image. The former methods largely depend on the handcraft threshold, but neither of them gives a measure that can find the best searching threshold. To avoid the problem of choosing the best handcraft threshold, Luo [16] computes the saliency density of the image and then searches for the rectangle window which contains the max saliency density. Shi [17] constructs a model that describes the detection of a region by maximizing the saliency value among it. The final salient object is obtained by the iteration of the ESS algorithm, and without handcrafted parameters become the privilege of the method. However, the advanced searching method not only exhibits the amount of time but also has no accordance with the human fixation mechanism. Cheng [18] defines a color contrast calculation with sparse quantification and a smooth term to obtain a global saliency. The model initializes the detection process by binarizing the saliency map, the final object is obtained by iterating the grab-cut algorithm with the operation of corrosion and expansion. As mentioned above, both the eye fixation methods and the salient object detection methods encounter the spatial structure problem. In our work, a proposed spatial coherence optimization algorithm is utilized instead of CRF and ESS algorithm to preserve the spatial structure information.

With the population of deep learning, a lot of methods leverage the advantage of convolutional neural network to calculate the segmentation task. Li [19] makes the early trial with CNN to extract multi-scale deep features. Instead of using CRF to enhance the spatial coherence, a super-pixel level saliency map is calculated. To adjust the obscure edge problem made by FCN, Hou [20] makes a deep supervised learning with the short connection to obtain a more accurate saliency map with edge information. Luo [21] utilizes CNN as a feature extraction measure which may extract both the global and local features. Liu [22] proposes some pooling modules and makes a joint training with the edge detection. Qin [23] makes a CNN module with both a feature extraction and refinement module to adjust the final saliency map. Zhao [24] makes a multi-scale feature extraction with CNN which is emphasized by spatial and channel attention. Since FCN methods lead to obscure edge problems which may drop the detection precision, Zhou [25] extracts and fuses the feature with two streams to integrate saliency maps, contour cues, and their correlation. Pang [26] makes a multi-scale interactive network to enhance the performance of feature fusion. Li [27] makes a joint training network for camouflaged object detection. The fusion of side output from multi-scale deep learning features increases the segmentation performance. Inspired by the multi-scale features used in deep learning, both the global and local saliency contrast information are considered in our work to obtain a more accurate salient object location.

The saliency map generated by traditional saliency measures is vague without an accurate complete region of the salient object. Supervised measures largely depend on a lot of prior knowledge and training tricks. To address the problems mentioned above, a novel fusion underwater salient object detection (FUSOD) algorithm is proposed. An improved color restoration algorithm are utilized to restore the original image and enhance the detection precision. Both the global feature contrast information and the local eye fixation feature are considered to calculate the multi-scale saliency which may increase the object location accuracy. A novel spatial coherence optimization algorithm is proposed to preserve the spatial structure information. The proposed FUSOD algorithm will detect the most salient underwater object with a complete region not only the edge and corner information in the complex underwater images.

Contributions for fusion underwater salient object detection are shown below:

(1): An improved color restoration method is utilized to compensate for the red color channel and correct the turbid scene in underwater images which will improve the performance of underwater salient object detection.
(2): A novel spatial optimization algorithm is proposed to enhance spatial coherence. The algorithm may optimize the super-pixel level saliency and make the whole framework preserve with more spatial structure information.
(3): A novel fusion underwater salient object detection algorithm is proposed which fully utilizes the underwater scenes and comprehensively considers the global and local contrast information in the underwater image. The proposed algorithm can detect the underwater salient object completely and accurately. Experiment results on the USOD dataset show that both the qualitative evaluation and the quantitative evaluation of the proposed FUSOD algorithm have a comparatively higher performance than the other traditional salient object detection algorithm.

The rest of the paper is organized as follows. In Section 2, the novel proposed fusion algorithm is described in detail. In Section 3, simulation results are presented. Conclusions are presented in Section 4.

2. The Proposed FUSOD Algorithm

As mentioned above, underwater imagery suffers from poor visibility resulting from the attenuation of the propagated light. Traditional salient object detection methods may only segment the object with edge or contour information while supervised measure largely depends on prior knowledge and training tricks. To solve the problems mentioned above, an improved underwater color restoration method is utilized to compensate for the red color channel. Both the global contrast and local contrast feature are fully considered to obtain a more accurate multi-scale fusion saliency map. Finally, the proposed spatial optimization algorithm is utilized to smooth the saliency fusion map which may enhance the spatial coherence information. The proposed algorithm detects underwater salient object with a complete region not only the edge and contour information. The algorithm is organized into three parts, Section 2.1 demonstrates the color restoration algorithm for underwater images, Section 2.2 describes the calculation of the multi-scale fusion saliency map and Section 2.3 demonstrates the spatial optimization algorithm. The whole framework of the proposed FUSOD algorithm is shown in Figure 1. The result of each step from the entire framework can be seen in Figure 2.

2.1. Underwater Color Restoration

The underwater scene is different from the common transmission medium, which is mainly affected by the properties of the target objects and camera lens characteristics. Research [5] shows that seawater assimilates gradually different wavelengths of light as the complex circumstances of the sea and sunlight. Since red color is the longest wavelength, it is the first to be absorbed. Followed by orange and yellow color. Evidence shows that the green channel and blue channel are relatively well preserved in the underwater images. This means the green channel and blue channel contain complementary color information to compensate for the strong attenuation of the red channel.

The green channel and the blue channel are utilized to compensate for the red channel

I_{r c} (x)

for each pixel. Firstly, the mean color value

I_{m e a n} (x)

of the whole color channels is calculated, where

{\bar{I}}_{r} (x)

,

{\bar{I}}_{g} (x)

and

{\bar{I}}_{b} (x)

refer to the mean value of the red channel, green channel, and blue channel, respectively:

I_{m e a n} (x) = \frac{{\bar{I}}_{r} (x) + {\bar{I}}_{g} (x) + {\bar{I}}_{b} (x)}{3}

(1)

I_{r c} (x) = {\begin{matrix} I_{r} (x) + \partial | {\bar{I}}_{g} (x) - I_{m e a n} (x) | & {\bar{I}}_{g} (x) > {\bar{I}}_{b} (x) \\ I_{r} (x) + \partial | {\bar{I}}_{b} (x) - I_{m e a n} (x) | & {\bar{I}}_{g} (x) < {\bar{I}}_{b} (x) \end{matrix}

(2)

Then the red color is compensated depending on the value of the blue channel and green channel. Where

I_{g} (x)

,

I_{b} (x)

and

I_{r} (x)

represent the green, the blue color, and the red channel of the image

I

.

I_{r c} (x)

is defined as the compensated red color channel.

\partial

is a constant parameter appropriate for various illumination conditions and acquisition settings.

Finally, the comprehensive compensated color information of the underwater image is obtained, which is described as

I_{R G B} (x)

:

I_{R G B} (x) = I_{r c} (x) + I_{g} (x) + I_{b} (x)

(3)

After the red channel attenuation has been compensated, the Gray-World algorithm

W

is utilized to estimate and compensate the illuminant color cast. Finally, a gamma correction filter is utilized to enhance the underwater image color contrast.

{I_{R G B}}^{'} (x) = F_{g a m m a} (W (I_{R G B} (x)))

(4)

where

F_{g a m m a}

means the gamma filter and

{I_{R G B}}^{'} (x)

refers to the final pixel level restored underwater image. The white balance algorithm makes the restored underwater image a brighter scene. In addition, gamma correction makes much of the color located on the bright side.

The result of each step from the improved underwater color restoration is shown in Figure 3. The underwater color restoration makes the color brighter, which is beneficial for the color contrast saliency calculation afterward, and improves the performance of underwater salient object detection.

2.2. Multi-Scale Fusion Saliency

To obtain a more accurate saliency map, both the global and local contrast information of the underwater image are fully considered. The global contrast saliency calculates the color contrast by the specific pixel with each quantified pixel from the whole image. Finally, the saliency extent of each pixel is obtained, which is the global saliency. Local contrast saliency imitates the human eye fixation mechanism which searches the whole image with a local color contrast calculation window. The saliency extent of each pixel is calculated by the specific pixel with its surrounding sample pixels. With the linear aggregation, the final multi-scale fusion saliency map is obtained.

2.2.1. Global Contrast Saliency

Inspired by the assumption [18] that not all color in the image is useful to the detection task and the color quantification is conducted in the restored underwater image

{I_{R G B}}^{'} (x)

, the color attribute is obtained after the quantification, where

C_{R G B}

represents the quantified RGB color field,

c_{j}

responds to the

j t h

color, and

N

means the number of colors in the quantified RGB color field.

C_{R G B} = \sum_{j = 1}^{N} c_{j}

(5)

The pixel

x_{i}

that has a low frequency of occurrence during the quantification process may adopt a similar color to the adjacent pixel

x_{j}

which has a high frequency of occurrence. The similar color may be defined by color distance

D_{R G B} (x_{i}, x_{j})

calculation between each pixel:

D_{R G B} (x_{i}, x_{j}) = \sqrt{{(x_{i}^{r} - x_{j}^{r})}^{2} + {(x_{i}^{g} - x_{j}^{g})}^{2} + {(x_{i}^{b} - x_{j}^{b})}^{2}}

(6)

where

x_{i}^{r}

represents the red color channel of the pixel

x_{i}

. The same meaning is to

x_{i}^{g}

and

x_{i}^{b}

. Then the sparsity

f_{j}

of each color is calculated after the color quantification.

f_{j} = \frac{| c_{j} |}{| c_{R G B} |}

(7)

Then, each pixel

x_{i}

has its corresponding color attribute by the color distance calculation

D_{R G B} (x_{i}, x_{j})

. In addition, each pixel has its own color sparsity

f_{i}

through the color sparsity calculation.

Since the CIELab area can best simulate human perception, the global contrast saliency is defined by calculating the color contrast of each pixel with others from the whole image in the CIELab area, which is defined as:

C_{L a b} = {(l_{1}, a_{1}, b_{1}), (l_{2}, a_{2}, b_{2}), \dots, (l_{N}, a_{N}, b_{N})}

(8)

D i s t (x_{i}) = \sum_{j = 1}^{N} \sqrt{{(l_{i} - l_{j})}^{2} + {(a_{i} - a_{j})}^{2} + {(b_{i} - b_{j})}^{2}}

(9)

Then, a Min–Max normalization of

D i s t (x_{i})

is calculated to obtain the normalized color contrast

D i s t {(x_{i})}^{'}

.

D i s t {(x_{i})}^{'} = \frac{D i s t (x_{i}) - \min (D i s t (x_{i}))}{\max (D i s t (x_{i})) - \min (D i s t (x_{i}))}

(10)

The color quantification and sparsity calculation will not affect the detection precision while reducing the computation of the global color contrast. With the multiplication of each pixel color sparsity

f_{i}

, the final global contrast saliency map is obtained, which is

G l o S a l (x_{i})

. The heat map of global contrast saliency is shown in Figure 4.

G l o S a l (x_{i}) = f_{i} \times D i s t {(x_{i})}^{'}

(11)

2.2.2. Local Contrast Saliency

There is a lot of evidence showing that human eye fixation will always be on the center of the visual scene while ignoring the surrounding areas. The phenomenon is called the center–surround principle. The local contrast saliency is estimated by a center–surround approach [28,29], which may calculate the location of the target object accurately. A rectangle window is established which is jointly divided into a center window

H_{0}

and a surround window

H_{1}

. Then, the window is slid in the whole image to calculate the local contrast saliency. The local contrast saliency calculation can be conducted as a Bayesian center–surround model which computes the saliency of each pixel

x_{i}

under certain feature whether salient or not.

p (x_{s a l} | f) = \frac{p (x_{s a l}) p (f | x_{s a l}, H_{0})}{p (x_{s a l}) p (f | x_{s a l}, H_{0}) + (1 - p (x_{s a l})) p (f | x_{s a l}, H_{1})}

(12)

While the sample pixel is salient, the pixel is defined as

x_{s a l}

.

p (f | x_{s a l}, H_{0})

refers to the likelihood calculation of the sample salient pixel in the center window.

p (f | x_{s a l}, H_{1})

refers to the likelihood calculation in the surround window.

f

responds to the color feature utilized to calculate the contrast saliency information.

The color contrast in the CIELab color area is calculated by a local Gaussian filter [29]. The center window

H_{0}

is squeezed to a pixel to sparse the sample of pixels and reduce the operation. Finally, the calculation of center window likelihood

p (f | x_{s a l}, H_{0})

can be expanded as:

p (f | x_{s a l}, H_{0}) = \frac{1}{m \sqrt{2 π} δ} \sum_{i = 1}^{m} \exp (- \frac{‖ f - f_{i}^{H_{0}} ‖}{2 δ^{2}})

(13)

where

m

is the number of sample pixels in the center window and

δ

is a standard deviation.

f_{i}^{H_{0}}

refers to the

i t h

feature of the sample pixel in the center window

H_{0}

.

The calculation of surround window likelihood

p (f | x_{s a l}, H_{1})

can be expanded as:

p (f | x_{s a l}, H_{1}) = \frac{1}{n \sqrt{2 π} δ} \sum_{i = 1}^{n} \exp (- \frac{‖ f - f_{i}^{H_{1}} ‖}{2 δ^{2}})

(14)

where

n

refers to the number of sample pixels and

f_{i}^{H_{1}}

describes the

i t h

feature of the sample pixel in the surround window

H_{1}

. The squeezed circle module [29] is obtained to filter and calculate the likelihood of the surround window

p (f | x_{s a l}, H_{1})

of the image. The assumption that saliency is objective to each pixel is adopted and the probability density function

p (x_{s a l})

is made a constant [28]. Finally, with Bayesian integration of prior and likelihood information, the local contrast saliency

L o c S a l (x_{i})

can be obtained. The heat map of local contrast saliency is shown in Figure 5.

L o c S a l (x_{i}) = p (x_{s a l} | f)

(15)

2.2.3. Saliency Fusion

Following the aggregation algorithm proposed by A. Borji [11], the multi-scale saliency is fused by linear aggregation. Given a set of

m

saliency maps

{S_{i} | 1 \leq j \leq m}

computed from an image

I

, the aggregated saliency value

S (x_{i})

at pixel

x_{i}

of

I

is modeled as the probability:

\begin{array}{l} S (x_{i}) = P (y_{x_{i}} = 1 | S_{1} (x_{i}), S_{2} (x_{i}), \dots, S_{m} (x_{i})) \\ \propto \frac{1}{Z} \sum_{j = 1}^{m} ℓ (S_{j} (x_{i})) \end{array}

(16)

where

S_{j} (p)

represents the saliency value of the pixel

x_{i}

in the saliency map

S_{j}

,

y_{x_{i}}

is a binary random variable taking the value 1 if

x_{i}

is a salient pixel and 0 otherwise, and

Z

is a constant. The paper [11] implemented three different options for the function

ℓ (S_{j} (x_{i}))

in Equation (16).

For effective calculation, a modified fusion strategy of the global and local contrast saliency

F (x_{i})

is conducted by linear aggregation with weighting parameter

\partial

:

F (x_{i}) = \partial G l o S a l (x_{i}) + (1 - \partial) L o c S a l (x_{i})

(17)

where

\partial

is a trade-off parameter for balancing the global and local contrast saliency. The

\partial

is set as 0.3 to the underwater domain-specific field by an amount of trial.

2.3. Spatial Coherence Optimization

Traditional salient object detection methods often detect the object with vague edge points or corner information without a complete whole body of the salient object. The super-pixel segmentation is utilized to preserve the spatial coherence information and energy minimization function to optimize the super-pixel region between each other.

With the assumption that human vision intends to view similar perceptual parts of the scene as a whole, the super-pixel information used in this paper is considered as the whole perceptual part not only imitates the human eye mechanism but also preserves the image spatial coherence information. The SLIC algorithm is utilized to segment images into multiple pieces which is

s p_{i}

, and each super-pixel contains the complete edge and spatial coherence information. Meanwhile, the pixel among the super-pixel has the union compatibility feature which means the pixel inside each super-pixel can be represented as a linear or affine combination of all the other points. Since the pixel in the super-pixel has a distinctive attraction towards the human eye fixation, the mean saliency value is obtained by averaging the pixel-level saliency inside each super-pixel which is the super-pixel level saliency

s a l (s p_{i})

:

s a l (s p_{i}) = \frac{1}{| R |} \sum_{x_{j} \in R} F (x_{j})

(18)

R

represents the super-pixel region,

x_{j}

demonstrates the specific point in the super-pixel region.

Since the persistence of vision on an image varies from person to person, the generated saliency map can only demonstrate some kind of prior information about the salient object. It cannot describe the whole body of the salient object. We smooth the super-pixel saliency value, which is calculated before approaching the complete region of salient object by energy minimization function

E_{s} (s p_{i}^{s})

. With the hypothesis that the saliency values of two adjacent super-pixels will not vary too much if they have a similar color feature. The super-pixel saliency map is optimized by the smooth terms in the energy minimization function. The compactness restriction of the smooth term in RGB area will make the foreground of the saliency map more outstanding than the background. In addition, the saliency gradient of each adjacent super-pixel will not change too much.

E_{s} (s p_{i}^{s}) = {\sum_{i} (s p_{i}^{s} - s a l (s p_{i}))}^{2} + \sum_{i, j \in ε} ω_{i, j} {(s a l (s p_{i}) - s a l (s p_{j}))}^{2}

(19)

ω_{i, j} = \exp (- D_{R G B} (s p_{i}^{c}, s p_{j}^{c}))

(20)

E_{s} (s p_{i}^{s})

represents the energy minimization function.

s a l (s p_{i})

demonstrates the mean saliency value of each super-pixel calculated before.

ε

means the small set which contains the two adjacent super-pixels

s p_{i}

and

s p_{j}

.

s p_{i}^{c}

and

s p_{j}^{c}

respond to the mean RGB value of the two adjacent super-pixels.

D_{R G B} (s p_{i}^{c}, s p_{j}^{c})

calculates the color value distance between the two specific adjacent super-pixel regions mentioned before. Weight

ω_{i, j}

is bigger if the color attribute of the two adjacent super-pixels is similar.

Finally, the saliency map of the fusion underwater salient object detection algorithm is obtained by iteration of the energy minimization function. The final fusion underwater saliency map optimized by the energy minimization function has better performance than the traditional salient object detection method. The saliency gradient between two adjacent super-pixels will not change too much, and the foreground of the object inside the final saliency map is more outstanding than the background. The spatial coherence optimization algorithm of the fusion salient object detection algorithm will make the final detection more accurate. The result of each step from the proposed FUSOD algorithm is shown in Figure 6. The heat map of each step from the entire framework shows that the final segmented object will preserve spatial structure information benefited from the proposed spatial optimization algorithm.

3. Results

In this section, we first briefly describe the experiment implementation details, introduction to the benchmark datasets, and evaluation metrics. Then, experiments on the publicly available datasets are conducted to evaluate and analysis the performance of the proposed fusion underwater salient object detection algorithm. Both qualitative and quantitative evaluation experiments are conducted to verify the proposed FUSOD algorithm. The proposed underwater color restoration algorithm and spatial coherence optimization algorithm may be a plug-in module for any marine optical detection task. Evaluation of underwater color restoration and spatial optimization is analyzed by ablation experiments. Experiments will also be conducted to analysis the effect of the hyper-parameter

\partial

set in the underwater color restoration and the multi-scale fusion saliency algorithm.

Implementation details

The proposed and compared saliency methods are experimented with an Intel i5 8400 CPU with matlab2020b. Following the official hyper-parameters set in the benchmark [11], experiment results on the specific dataset are credible.

Dataset

USOD [30] is a challenging dataset to evaluate underwater SOD methods. The dataset combines the subset of three underwater datasets which is USR-248 [31], UIEB [32], and EUVP [33]. It contains 300 natural underwater images which are exhaustively compiled to ensure diversity in the object categories, waterbody, optical distortions, and aspect ratio of the salient objects. The combination of underwater datasets may make the USOD dataset more complex and difficult for underwater salient object detection tasks. The dataset provides binary segmentation images as ground truth for the evaluation of segmentation tasks.

Evaluation criteria

The quantitative evaluation of the salient object detection algorithm with other traditional saliency methods are conducted based on the widely used evaluation criteria:

(1): Precision-Recall (PR): PR is a standard performance metric and is complementary to mean absolute error. It is evaluated by binarizing the predicted saliency maps with a threshold sliding from 0 to 255 and then performing a bin-wise comparison with the ground truth values. A saliency map $S (x, y)$ is first converted to a binary mask $M$ with the ground truth $G$ :

$P r e c i s i o n = \frac{| M \cap G |}{| M |}$

(21)

$R e c a l l = \frac{| M \cap G |}{| G |}$

(22)

Achanta [15] proposes the image-dependent adaptive threshold for binarizing $S (x, y)$ , which is computed as twice as the mean saliency of $S (x, y)$ :

$T_{α} = \frac{2}{W \times H} \sum_{x = 1}^{W} \sum_{y = 1}^{H} S (x, y)$

(23)
(2): F-measure: F-measure is an overall performance measurement that is computed by the weighted harmonic mean of the precision and recall, where the parameter $β$ is often set to 0.3:

$F_{β} = \frac{(1 + β^{2}) \times P r e c i s i o n \times R e c a l l}{β^{2} \times P r e c i s i o n \times R e c a l l}$

(24)

Larger F-measure score indicates better performance.
(3): MAE: MAE is defined as the average pixel-wise absolute error between the prediction saliency map $S (x, y)$ and the binary ground truth $G (x, y)$ , both normalized in the range $[0, 1]$ . The smaller MAE indicates better performance.

$M A E = \frac{1}{H \times W} \sum_{y = 1}^{H} \sum_{x = 1}^{W} | S (x, y) - G (x, y) |$

(25)

where $H$ and $W$ refer to the height and width of the saliency map, respectively.

Nomenclature of the abbreviations

To improve the readability of the paper, a nomenclature table of the abbreviations utilized in the algorithm is presented below in Table 1. Each line describes the abbreviation which is used in the afterward experiments.

3.1. Experiment of Fusion Underwater Salient Object Detection Algorithm

Eight traditional saliency methods are chosen from the state-of-the-art benchmark. The eight methods are CA [34], COV [35], FES [29], SEG [28], SeR [36], SIM [37], SUN [38] and SWD [39] measure. Details of qualitative evaluation and quantitative evaluation are shown below:

3.1.1. Qualitative Evaluation

The proposed fusion underwater salient object detection algorithm (FUSOD) and the eight compared salient object detection measures are evaluated on different underwater scenes to imitate the complex underwater natural environment and evaluate the robustness performance of the algorithm. Several representative results are shown below to analyze the detection problem of the traditional SOD. Qualitative experiment results can be found in Figure 7.

As mentioned before, vague foreground and incomplete detection results are the drawbacks to traditional SOD algorithms. From the qualitative results above, there is only edge and contour information detected by the traditional bottom-up saliency methods of CA without a complete whole body of the salient object. COV and FES methods focus on locating the object with a vague object detection result. However, the coarse location information may be utilized as a prior knowledge of many computer vision tasks. SEG use CRF to preserve spatial coherence information with a low detection result on some image which may detect the non-salient background information. SeR and SIM segment a blurry foreground with a lot of background information. SUN method utilizes bottom-up saliency to predict human eye fixation incorporated with top-down information. However, the result contains a lot of points from background statistics distribution. SWD method also focuses on predicting human eye fixation while missing the complete target object. It utilizes biological mechanisms and evaluates the spatial-weight dissimilarity. From the results above, the detection result focuses on the center of the view. The proposed fusion underwater salient object detection algorithm (FUSOD) can generate a better saliency map that benefits from the color restoration algorithm, the multi-scale contrast calculation, and the spatial coherence optimization algorithm. The saliency map generated from the proposed algorithm shows that both the location and the complete shape of the object can be detected accurately.

3.1.2. Quantitative Evaluation

Standard evaluations, such as PR curve, F-measure, and MAE are utilized to evaluate the quantitative performance of the SOD algorithm in the underwater domain-specific field. Quantitative evaluation results may be found in Figure 8 and Figure 9, and Table 2. Experiment results show that SIM, SUN, and SeR have a comparatively low performance because they only use local saliency information and without an optimization refinement process. From the qualitative result above, SIM may detect a lot of background information which may increase the MAE score indicating fault detection. The FES method may have a comparatively high location capability but still with a lot of error information. CA combines local feature and global feature to detect the most salient eye fixation point but lost spatial information of the target object. The combination of multi-scale fusion saliency and spatial coherence optimization can make the proposed FUSOD algorithm not only preserve the complete region of the object body but also result in a high detection precision performance.

3.2. Ablation Study of Underwater Color Restoration

3.2.1. Qualitative and Quantitative Analysis of Underwater Color Restoration

The proposed fusion underwater salient object detection algorithm aims at segmenting the most salient object in a domain-specific area with a complete whole body accurately. The preprocessing of the turbid image is important for afterward color contrast calculation. So, we set an ablation experiment to analyze the effect of the color restoration algorithm on the proposed FUSOD algorithm. The details of experiment results are shown in Figure 10 and Figure 11, and Table 3.

The qualitative experiment result above shows that the restored underwater image may be a bit brighter than the original view. The corrected color may move towards the bright side which may benefit afterwards color contrast calculation.

The red plot (FUSOD) represents the proposed fusion underwater salient object detection algorithm with the color restoration preprocess step. While the blue one (FUSOD-ucr) means the fusion underwater SOD algorithm without underwater color restoration preprocess. From the quantitative evaluation results, the detection precision will be a bit lower if the color restoration preprocess algorithm is emitted from the whole framework. The color feature is conducted in the whole detection process, such as global contrast calculation, local contrast calculation, and spatial coherence optimization. The underwater images are distinctive from common images and color attenuation may have an important effect on optical-based applications and research. The color restoration algorithm will not hugely enhance the detection performance, but the color compensation and correction will benefit the amount of afterward tasks.

3.2.2. Experiment of Hyper-Parameter in Underwater Color Restoration

To find the best

\partial

in Equation (2), an experiment is set with different values of

\partial

. From the PR curve and F-measure score in Figure 12 and Table 4, the best

\partial

may be set as 0.2. Experiment results demonstrate that the detection performance will have a minor change while the hyper parameter

\partial

changes from 0.1 to 0.5. Evidence [5] shows that

\partial

equals to 0.1 may have a better color restoration result. We set

\partial

as 0.2 according to the experiment and the underwater domain-specific field.

3.3. Experiment of Hyper-Parameter in Multi-Scale Saliency Fusion

The constant parameter

\partial

in Equation (17) is a trade-off parameter to balance the multi-scale contrast saliency. The heat map of each step in the entire framework shows that the local contrast saliency may provide more location information than global contrast saliency. The PR curve and F-measure in Figure 13 and Table 5 demonstrate that

\partial

may be set as 0.3 to obtain the best performance.

3.4. Ablation Study of Spatial Coherence Optimization

The proposed spatial optimization algorithm is calculated with super-pixel segmentation and energy minimization function to optimize the coarse multi-scale fusion saliency. The method may enhance the spatial coherence and make the final saliency map segmented with a complete body of the salient object. It meets the physical rule that the whole system is stable while the value of the energy function is the lowest. So, the final FUSOD is optimized by iteration of energy minimization. We set an ablation experiment to evaluate the performance between the coarse pixel level multi-scale fusion saliency, super-pixel level saliency without energy minimization, and the final optimized FUSOD algorithm. Experiment results can be found in Figure 14 and Figure 15, and Table 6.

The FUSOD refers to the final optimized fusion underwater salient object detection algorithm. The FUSOD-p means the pixel-level coarse multi-scale fusion saliency map. The FUSOD-sp represents the super-pixel level saliency methods without energy minimization optimization. From the qualitative evaluation results presented in Figure 14, the coarse multi-scale fusion saliency may segment the salient object with global contrast information and local eye fixation features. The coarse saliency map is calculated by pixel from the image. The foreground segmentation may be the full extent of the salient object than the traditional bottom-up saliency methods, which means the fixation points and region information are both detected from the image. The super-pixel level saliency map preserves the spatial coherence information. However, both of the former saliency maps have a lot of fault background detection results which will result in a low precision and comparatively high MAE score. Finally, the saliency map segmented by the FUSOD algorithm may segment the foreground outstanding than the background with the help of energy minimization optimization. Both the foreground and background regions are smooth inside.

The quantitative evaluation results above show that the coarse multi-scale fusion saliency map (FUSOD-p) may have a low performance compared with the other two methods. The MAE column shows that the super-pixel level saliency segmentation (FUSOD-sp) may have a comparatively high F-measure but retain a lot of fault detection. The optimized FUSOD algorithm may have a high detection precision with a low MAE score.

4. Conclusions

In this paper, a novel fusion underwater salient object detection method for underwater domain-specific fields has been studied. The improved color restoration method shows that the compensation for the red channel and color correction may be beneficial for color contrast calculation and increase the underwater object detection accuracy. The multi-scale fusion saliency algorithm may obtain a more accurate object location in the complex underwater scene. The proposed spatial optimization method makes the segmented area preserve a complete object structure with the edge and contour information, which is beneficial for increasing detection precision in underwater turbid images. Various underwater scenes will be studied to increase the generalizability of the findings in the future.

Author Contributions

Conceptualization, W.H., D.Z. and M.C.; methodology, W.H.; software, W.H.; validation, W.H. and D.Z.; formal analysis, W.H. and D.Z.; investigation, M.C.; resources, W.H.; data curation, W.H.; writing—original draft preparation, W.H.; writing—review and editing, D.Z. and M.C.; visualization, W.H. and M.C.; supervision, D.Z. and M.C.; project administration, W.H.; funding acquisition, D.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This project is supported by the National Natural Science Foundation of China (62033009) and the Creative Activity Plan for Science and Technology Commission of Shanghai (21DZ2293500) and Science Foundation of Donhai Laboratory (DH-2022KF01013).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Access to the data will be considered upon request by the authors.

Conflicts of Interest

The authors declare no conflict of interest.

References

Jian, M.; Liu, X.; Luo, H.; Lu, X.; Yu, H.; Dong, J. Underwater image processing and analysis: A review. Signal Process. Image Commun. 2020, 91, 116088. [Google Scholar] [CrossRef]
Hu, K.; Weng, C.; Zhang, Y.; Jin, J.; Xia, Q. An overview of underwater vision enhancement: From traditional methods to recent deep learning. J. Mar. Sci. Eng. 2022, 10, 241. [Google Scholar] [CrossRef]
Zhou, J.; Yang, T.; Chu, W.; Zhang, W. Underwater image restoration via backscatter pixel prior and color compensation. Eng. Appl. Artif. Intell. 2022, 111, 104785. [Google Scholar] [CrossRef]
Lei, F.; Tang, F.; Li, S. Underwater target detection algorithm based on improved YOLOv5. J. Mar. Sci. Eng. 2022, 10, 310. [Google Scholar] [CrossRef]
Ancuti, C.O.; Ancuti, C.; De Vleeschouwer, C.; Bekaert, P. Color balance and fusion for underwater image enhancement. IEEE Trans. Image Process. 2017, 27, 379–393. [Google Scholar]
Tao, Y.; Dong, L.; Xu, L.; Chen, G.; Xu, W. An effective and robust underwater image enhancement method based on color correction and artificial multi-exposure fusion. Multimedia Tools Appl. 2023, 1–21. [Google Scholar] [CrossRef]
Fu, X.; Cao, X. Underwater image enhancement with global–local networks and compressed-histogram equalization. Signal Process. Image Commun. 2020, 86, 115892. [Google Scholar] [CrossRef]
Zhou, J.; Zhang, D.; Ren, W.; Zhang, W. Auto color correction of underwater images utilizing depth information. IEEE Geosci. Remote. Sens. Lett. 2022, 19, 1504805. [Google Scholar] [CrossRef]
Ghate, S.N.; Nikose, M.D. New Approach to Underwater Image Dehazing using Dark Channel Prior. J. Physics: Conf. Ser. 2021, 1937, 012045. [Google Scholar] [CrossRef]
Borji, A.; Cheng, M.-M.; Hou, Q.; Jiang, H.; Li, J. Salient object detection: A survey. Comput. Vis. Media 2019, 5, 117–150. [Google Scholar] [CrossRef]
Borji, A.; Cheng, M.-M.; Jiang, H.; Li, J. Salient object detection: A benchmark. IEEE Trans. Image Process. 2015, 24, 5706–5722. [Google Scholar] [CrossRef] [PubMed]
Itti, L.; Koch, C.; Niebur, E. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 1254–1259. [Google Scholar]
Liu, T.; Yuan, Z.; Sun, J.; Wang, J.; Zheng, N.; Tang, X.; Shum, H.Y. Learning to detect a salient object. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 353–367. [Google Scholar]
Lampert, C.H.; Blaschko, M.B.; Hofmann, T. Efficient Subwindow Search: A Branch and Bound Framework for Object Localization. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 31, 2129–2142. [Google Scholar] [CrossRef]
Achanta, R.; Süsstrunk, S. Frequency-tuned salient region detection. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009. [Google Scholar]
Luo, Y.; Yuan, J.; Xue, P.; Tian, Q. Saliency density maximization for efficient visual objects discovery. IEEE Trans. Circuits Syst. Video Technol. 2011, 21, 1822–1834. [Google Scholar] [CrossRef]
Shi, R.; Liu, Z.; Du, H.; Zhang, X.; Shen, L. Region diversity maximization for salient object detection. IEEE Signal Process. Lett. 2012, 19, 215–218. [Google Scholar] [CrossRef]
Cheng, M.M.; Mitra, N.J.; Huang, X.; Torr, P.H.; Hu, S.M. Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 37, 569–582. [Google Scholar]
Li, G.; Yu, Y. Visual saliency based on multiscale deep features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015. [Google Scholar]
Hou, Q.; Cheng, M.M.; Hu, X.; Borji, A.; Tu, Z.; Torr, P.H. Deeply supervised salient object detection with short connections. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
Luo, Z.; Mishra, A.; Achkar, A.; Eichel, J.; Li, S.; Jodoin, P.M. Non-local deep features for salient object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
Liu, J.J.; Hou, Q.; Cheng, M.M.; Feng, J.; Jiang, J. A simple pooling-based design for real-time salient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019. [Google Scholar]
Qin, X.; Zhang, Z.; Huang, C.; Gao, C.; Dehghan, M.; Jagersand, M. Basnet: Boundary-aware salient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019. [Google Scholar]
Zhao, T.; Wu, X. Pyramid feature attention network for saliency detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019. [Google Scholar]
Qin, X.; Zhang, Z.; Huang, C.; Gao, C.; Dehghan, M.; Jagersand, M. Interactive two-stream decoder for accurate and fast saliency detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020. [Google Scholar]
Pang, Y.; Zhao, X.; Zhang, L.; Lu, H. Multi-scale interactive network for salient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020. [Google Scholar]
Li, A.; Zhang, J.; Lv, Y.; Liu, B.; Zhang, T.; Dai, Y. Uncertainty-aware joint salient object and camouflaged object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021. [Google Scholar]
Rahtu, E.; Kannala, J.; Salo, M.; Heikkilä, J. Segmenting salient objects from images and videos. In Proceedings of the European Conference on Computer Vision, Heraklion, Greece, 5–11 September 2010. [Google Scholar] [CrossRef]
Tavakoli, H.R.; Rahtu, E.; Heikkilä, J. Fast and efficient saliency detection using sparse sampling and kernel density estimation. In Proceedings of the Scandinavian Conference on Image Analysis, Ystad, Sweden, 1 May 2011. [Google Scholar]
Islam, M.J.; Wang, R.; de Langis, K.; Sattar, J. Svam: Saliency-guided visual attention modeling by autonomous underwater robots. arXiv 2020, arXiv:2011.06252. [Google Scholar]
Islam, M.J.; Enan, S.S.; Luo, P.; Sattar, J. Underwater image super-resolution using deep residual multipliers. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020. [Google Scholar]
Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An underwater image enhancement benchmark dataset and beyond. IEEE Trans. Image Process. 2019, 29, 4376–4389. [Google Scholar] [CrossRef]
Islam, J.; Xia, Y.; Sattar, J. Fast underwater image enhancement for improved visual perception. IEEE Robot. Autom. Lett. 2020, 5, 3227–3234. [Google Scholar] [CrossRef]
Goferman, S.; Zelnik-Manor, L.; Tal, A. Context-aware saliency detection. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 34, 1915–1926. [Google Scholar]
Erdem, E.; Erdem, A. Visual saliency estimation by nonlinearly integrating features using region covariances. J. Vis. 2013, 13, 11. [Google Scholar] [CrossRef]
Seo, H.J.; Milanfar, P. Static and space-time visual saliency detection by self-resemblance. J. Vis. 2009, 9, 15. [Google Scholar] [CrossRef] [PubMed]
Murray, N.; Vanrell, M.; Otazu, X.; Parraga, C.A. Saliency estimation using a non-parametric low-level vision model. In Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 20–25 June 2011. [Google Scholar] [CrossRef]
Zhang, L.; Tong, M.H.; Marks, T.K.; Shan, H.; Cottrell, G.W. SUN: A Bayesian framework for saliency using natural statistics. J. Vis. 2008, 8, 32. [Google Scholar] [CrossRef] [PubMed]
Duan, L.; Wu, C.; Miao, J.; Qing, L.; Fu, Y. Visual saliency detection by spatially weighted dissimilarity. In Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 20–25 June 2011. [Google Scholar] [CrossRef]

Figure 1. Whole framework of the proposed fusion underwater salient object detection algorithm.

Figure 2. The result of each step from the proposed FUSOD framework.

Figure 3. The result of each step from the improved underwater color restoration algorithm. (a) The original image, (b) the color compensation, (c) the gray world, (d) the final restored image.

Figure 4. The heat map of the global contrast saliency. (a) original image, (b) global contrast saliency.

Figure 5. The heat map of the local contrast saliency. (a) original image, (b) local contrast saliency.

Figure 6. The result of each step from the proposed FUSOD algorithm. (a) The global contrast, (b) the local contrast, (c) the linear fusion, (d) the super-pixel saliency, (e) the energy minimization.

Figure 7. Qualitative evaluation of the proposed FUSOD algorithm with other saliency methods on USOD dataset. (a) The original image, (b) CA method, (c) COV method, (d) FES methods (e) SEG method, (f) SeR method, (g) SIM method, (h) SUN method, (i) SWD method, (j) proposed FUSOD method, (k) ground truth.

Figure 8. PR curve of different algorithms on USOD dataset, which include FUSOD, CA, COV, FES and SEG methods.

Figure 9. PR curve of different algorithms on USOD dataset, which include FUSOD, SeR, SIM, SUN and SWD method.

Figure 10. Comparison of the underwater image before and after color restoration algorithm. (a) Original image, (b) color histogram of the original image, (c) restored image, (d) color histogram of the restored image.

Figure 11. PR curve of the underwater color restoration ablation experiment.

Figure 12. Experiments of hyper-parameter in underwater color restoration.

Figure 13. Experiments of hyper-parameter in multi-scale fusion saliency method.

Figure 14. Qualitative evaluation of the spatial optimization ablation experiment. (a) Original image, (b) pixel-level FUSOD saliency without super-pixel calculation, (c) super-pixel level saliency without energy minimization optimization, (d) proposed FUSOD method with super-pixel calculation and energy minimization optimization, (e) ground truth.

Figure 15. PR curve of the spatial optimization ablation experiment.

Table 1. The nomenclature of the abbreviations in the experiment.

Abbreviations	Signature
FUSOD	The proposed fusion underwater salient object detection algorithm.
FUSOD-p	The pixel-level fusion underwater salient object detection algorithm which means the FUSOD algorithm without the proposed spatial optimization module.
FUSOD-sp	The super-pixel level fusion underwater salient object detection algorithm which means the FUSOD algorithm without the proposed energy minimization optimization module.
FUSOD-ucr	The FUSOD algorithm without underwater color restoration process.

Table 2. F-measure and MAE score of different saliency algorithms.

Methods	F-Measure	MAE
CA	0.5433	0.260618
COV	0.5525	0.205216
FES	0.5848	0.314445
SEG	0.5718	0.331153
SeR	0.5167	0.320286
SIM	0.4890	0.381647
SUN	0.5304	0.340639
SWD	0.5902	0.287874
FUSOD	0.6282	0.195531

Table 3. F-measure and MAE score of the underwater color restoration ablation experiment.

Methods	F-Measure	MAE
FUSOD	0.6282	0.195531
FUSOD-ucr	0.6089	0.213947

Table 4. F-measure and MAE score of the different

\partial

in the underwater color restoration method.

Table 4. F-measure and MAE score of the different

\partial

in the underwater color restoration method.

Methods	F-Measure	MAE
∂ = 0.1	0.6250	0.196534
∂ = 0.2	0.6282	0.195531
∂ = 0.3	0.6242	0.196529
∂ = 0.4	0.6191	0.196657
∂ = 0.5	0.6180	0.198521

Table 5. F-measure and MAE score of the different

\partial

in the multi-scale fusion saliency method.

Table 5. F-measure and MAE score of the different

\partial

in the multi-scale fusion saliency method.

Methods	F-Measure	MAE
$\partial = 0.1$	0.6240	0.197521
$\partial = 0.2$	0.6202	0.196540
$\partial = 0.3$	0.6282	0.195531
$\partial = 0.4$	0.6159	0.201530
$\partial = 0.5$	0.6031	0.204571

Table 6. F-measure and MAE of the spatial optimization ablation experiment.

Methods	F-Measure	MAE
FUSOD	0.6282	0.195531
FUSOD-p	0.5957	0.282547
FUSOD-sp	0.6049	0.312752

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Huang, W.; Zhu, D.; Chen, M. A Fusion Underwater Salient Object Detection Based on Multi-Scale Saliency and Spatial Optimization. J. Mar. Sci. Eng. 2023, 11, 1757. https://doi.org/10.3390/jmse11091757

AMA Style

Huang W, Zhu D, Chen M. A Fusion Underwater Salient Object Detection Based on Multi-Scale Saliency and Spatial Optimization. Journal of Marine Science and Engineering. 2023; 11(9):1757. https://doi.org/10.3390/jmse11091757

Chicago/Turabian Style

Huang, Weiliang, Daqi Zhu, and Mingzhi Chen. 2023. "A Fusion Underwater Salient Object Detection Based on Multi-Scale Saliency and Spatial Optimization" Journal of Marine Science and Engineering 11, no. 9: 1757. https://doi.org/10.3390/jmse11091757

APA Style

Huang, W., Zhu, D., & Chen, M. (2023). A Fusion Underwater Salient Object Detection Based on Multi-Scale Saliency and Spatial Optimization. Journal of Marine Science and Engineering, 11(9), 1757. https://doi.org/10.3390/jmse11091757

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

A Fusion Underwater Salient Object Detection Based on Multi-Scale Saliency and Spatial Optimization

Abstract

1. Introduction

2. The Proposed FUSOD Algorithm

2.1. Underwater Color Restoration

2.2. Multi-Scale Fusion Saliency

2.2.1. Global Contrast Saliency

2.2.2. Local Contrast Saliency

2.2.3. Saliency Fusion

2.3. Spatial Coherence Optimization

3. Results

3.1. Experiment of Fusion Underwater Salient Object Detection Algorithm

3.1.1. Qualitative Evaluation

3.1.2. Quantitative Evaluation

3.2. Ablation Study of Underwater Color Restoration

3.2.1. Qualitative and Quantitative Analysis of Underwater Color Restoration

3.2.2. Experiment of Hyper-Parameter in Underwater Color Restoration

3.3. Experiment of Hyper-Parameter in Multi-Scale Saliency Fusion

3.4. Ablation Study of Spatial Coherence Optimization

4. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI