Article

A New Hybrid Descent Algorithm for Large-Scale Nonconvex Optimization and Application to Some Image Restoration Problems

1 School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
2 Department of Mathematics and Science, School of Science, Zhejiang Sci-Tech University, Hangzhou 310018, China
3 School of Mathematics, Liaoning Normal University, Dalian 116026, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(19), 3088; https://doi.org/10.3390/math12193088
Submission received: 18 August 2024 / Revised: 25 September 2024 / Accepted: 27 September 2024 / Published: 2 October 2024
(This article belongs to the Special Issue Optimization Algorithms: Theory and Applications)

Abstract

Conjugate gradient methods are widely used and attractive for large-scale unconstrained smooth optimization problems, thanks to their simple computation, low memory requirements, and useful curvature information. Based on the strong convergence property of the Dai–Yuan method and the attractive numerical performance of the Hestenes–Stiefel method, a new hybrid descent conjugate gradient method is proposed in this paper. The proposed method satisfies the sufficient descent property independent of the accuracy of the line search strategy. Under standard conditions, the trust region property and the global convergence are established, respectively. Numerical results on 61 problems, each tested in 9 large-scale dimensions, and on 46 ill-conditioned matrix problems reveal that the proposed method is more effective, robust, and reliable than the compared methods. Additionally, the hybrid method also demonstrates reliable results for some image restoration problems.

1. Introduction

In this paper, we consider the following unconstrained problem:
$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1)$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable, bounded below, and its gradient is available. There are many effective methods for problem (1), such as Newton-type methods, quasi-Newton-type methods, spectral gradient methods, and conjugate gradient (CG for abbreviation) methods [1,2,3,4,5,6,7,8,9,10,11], etc. Meanwhile, there are also various derivative-free optimization tools, such as Nelder–Mead, generalized simulated annealing, and genetic algorithms [12,13,14], etc., for problem (1). In this paper, we focus on CG methods and propose a new hybrid CG method for the large-scale problem (1). Actually, CG methods are among the most effective methods for unconstrained problems, especially for large-scale cases, due to their low storage and global convergence properties [3]. The iterative point is usually generated by
$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, \ldots, \qquad (2)$$
where $x_k$ is the current iterate; the scalar $\alpha_k > 0$ is the step length, determined by some line search strategy; and $d_k$ is the search direction, defined by
$$d_k = \begin{cases} -g_k, & \text{if } k = 0, \\ -g_k + \beta_k d_{k-1}, & \text{if } k \ge 1, \end{cases} \qquad (3)$$
where $g_k := g(x_k) = \nabla f(x_k)$ and $\beta_k$ is called the conjugate parameter. A number of CG methods have been proposed by various modifications of the direction $d_k$ and the parameter $\beta_k$; see [4,5,6,7,8,9,10,11,15,16,17,18,19,20], etc. Some CG methods have strong convergence properties, but their numerical performance may not be good in practice due to the jamming phenomenon [4]. These methods include Fletcher–Reeves (FR) [5], Dai–Yuan (DY) [6], and Fletcher (CD) [7], with the following conjugate parameters:
$$\beta_{k+1}^{FR} = \frac{\|g_{k+1}\|^2}{\|g_k\|^2}, \qquad \beta_{k+1}^{DY} = \frac{\|g_{k+1}\|^2}{y_k^T d_k}, \qquad \beta_{k+1}^{CD} = -\frac{\|g_{k+1}\|^2}{g_k^T d_k},$$
where $g_{k+1} = \nabla f(x_{k+1})$, $y_k = g_{k+1} - g_k$, and $\|\cdot\|$ stands for the Euclidean norm. On the other hand, some other CG methods may perform well in practice, but their convergence may not be guaranteed, especially for nonconvex functions. These methods include Hestenes–Stiefel (HS) [8], Polak–Ribière–Polyak (PRP) [9,10], and Liu–Storey (LS) [11], with the following conjugate parameters:
$$\beta_{k+1}^{HS} = \frac{g_{k+1}^T y_k}{y_k^T d_k}, \qquad \beta_{k+1}^{PRP} = \frac{g_{k+1}^T y_k}{\|g_k\|^2}, \qquad \beta_{k+1}^{LS} = -\frac{g_{k+1}^T y_k}{g_k^T d_k}.$$
In fact, these methods possess an automatic approximate restart feature which can avoid the jamming phenomenon; that is, when the step $s_k$ is small, the factor $y_k$ tends to zero, so the conjugate parameter $\beta_{k+1}$ becomes small and the new direction $d_{k+1}$ approaches the steepest descent direction $-g_{k+1}$.
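As a quick illustration of the generic CG iteration (2)–(3), the following Python sketch (not part of the paper; it assumes the denominators are nonzero and omits all safeguards) evaluates a few of the classical conjugate parameters listed above and returns the corresponding search direction.

```python
import numpy as np

def cg_direction(g_new, g_old, d_old, rule="PRP"):
    """Return d_{k+1} = -g_{k+1} + beta_{k+1} * d_k for a classical beta choice."""
    y = g_new - g_old                       # y_k = g_{k+1} - g_k
    if rule == "FR":
        beta = g_new.dot(g_new) / g_old.dot(g_old)
    elif rule == "DY":
        beta = g_new.dot(g_new) / y.dot(d_old)
    elif rule == "HS":
        beta = g_new.dot(y) / y.dot(d_old)
    elif rule == "PRP":
        beta = g_new.dot(y) / g_old.dot(g_old)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return -g_new + beta * d_old
```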
To attain good computational performance and maintain the attractive feature of strong global convergence, many scholars have paid special attention to hybridizing these CG methods. Specifically, the authors in [21] proposed a hybrid PRP-FR CG method (the H1 method in [22]), whose conjugate parameter was defined as $\beta_{k+1}^{H1} = \max\{0, \min\{\beta_{k+1}^{FR}, \beta_{k+1}^{PRP}\}\}$. Moreover, based on the above hybrid conjugate parameter, a new form was proposed in [23], where the parameter was defined by $\beta_{k+1} = \max\{-\beta_{k+1}^{FR}, \min\{\beta_{k+1}^{FR}, \beta_{k+1}^{PRP}\}\}$, and the global convergence property was established for general functions without the convexity assumption. In [24], a hybrid of the HS method and the DY method was proposed, in which the conjugate parameter was defined by $\beta_{k+1}^{H2} = \max\{0, \min\{\beta_{k+1}^{HS}, \beta_{k+1}^{DY}\}\}$. The numerical results indicated that this hybrid method was more effective than the PRP algorithm. In the above hybrid CG methods, the search direction was in the form of (3). Moreover, the authors in [25] proposed a new hybrid three-term method in which the conjugate parameter is $\beta_{k+1}^{H2}$ and the direction is $d_{k+1} = -g_{k+1} + (1-\lambda_{k+1})\beta_{k+1}^{H2} d_k + \lambda_{k+1}\theta_{k+1} g_k$, where $\lambda_{k+1}$ is a convex combination parameter. This hybrid method demonstrates attractive numerical performance. Furthermore, in [22], the authors proposed two new hybrid methods based on the above conjugate parameters with different search directions. Concretely, the directions have the following common form:
$$d_{k+1} = -\left(1 + \beta_{k+1}\frac{g_{k+1}^T d_k}{\|g_{k+1}\|^2}\right) g_{k+1} + \beta_{k+1} d_k, \qquad (4)$$
where $\beta_{k+1} = \beta_{k+1}^{H1}$ or $\beta_{k+1} = \beta_{k+1}^{H2}$. A remarkable feature of the above directions is that the sufficient descent property is automatically satisfied, independent of the accuracy of the line search strategy.
Motivated by the above discussions, in this paper, we propose a new hybrid descent CG method for large-scale nonconvex problems. The proposed hybrid method automatically enjoys the sufficient descent property independent of the accuracy of the line search technique. Furthermore, the global convergence for the general functions without convexity is established under the standard conditions. Numerical results of 549 large-scale problems and 46 ill-conditioned matrix problems indicate the proposed method is attractive and promising. Finally, we also apply the proposed method to some image restoration problems, which also verifies its reliability and effectiveness.
The rest of the paper is organized as follows. In Section 2, we propose a descent hybrid CG method which is based on the MHS method and DY method. Moreover, the sufficient descent property is satisfied independent of the accuracy of the line search techniques. Global convergence is established for the general function in Section 3. Numerical results are given in Section 4 to indicate the effectiveness and reliability of the proposed algorithm. Finally, some conclusions are presented.

2. Motivation, Algorithm, and Sufficient Descent Property

As mentioned in the above section, the HS method is generally regarded as one of the most effective CG methods, but its global convergence for general nonlinear functions is still not guaranteed. Additionally, the HS method does not guarantee the descent property during the iterative process; that is, the condition $g_k^T d_k < 0$ may not be satisfied for $k \ge 1$. Therefore, many researchers have devoted themselves to designing descent HS conjugate gradient methods [4,24,26,27,28,29,30], etc. Specifically, to obtain an intuitively modified conjugate parameter, the authors in [26] approximated the direction $d_{k+1}^{THS}$ by the two-term direction (3), where $d_{k+1}^{THS}$ was defined by (4) with $\beta_{k+1} = \beta_{k+1}^{HS}$. Concretely, the least squares problem $\min_{\beta}\|-g_{k+1} + \beta d_k - d_{k+1}^{THS}\|^2$ was solved. After some algebraic manipulations, the unique solution was found to be
$$\beta_{k+1}^{MHS} = \frac{g_{k+1}^T y_k}{y_k^T d_k}\left(1 - \frac{(g_{k+1}^T d_k)^2}{\|g_{k+1}\|^2 \|d_k\|^2}\right) = \beta_{k+1}^{HS}\,\vartheta_k, \qquad (5)$$
where
$$\vartheta_k = 1 - \frac{(g_{k+1}^T d_k)^2}{\|g_{k+1}\|^2 \|d_k\|^2}. \qquad (6)$$
The above parameter and its modifications have some nice theoretical properties [26], and the method with (5) and (3) performs well. Meanwhile, it is clear that if the exact line search is adopted (i.e., $g_{k+1}^T d_k = 0$), it holds that $\beta_{k+1}^{MHS} = \beta_{k+1}^{HS} = \beta_{k+1}^{PRP}$.
To attain attractive computational performance and good theoretical properties, many researchers have proposed hybrid CG methods. Among these methods, hybridizations of the HS method and the DY method have shown promising numerical performance [31,32,33,34], etc. The HS method has the nice property of automatically satisfying the conjugate condition $d_{k+1}^T y_k = 0$ for $k \ge 0$, independent of the accuracy of the line search strategy and the convexity of the objective function, and it performs well in practice. On the other hand, the DY method has remarkable convergence properties. These characteristics motivate us to propose new hybridizations of the HS method and the DY method which not only have attractive theoretical properties but also better numerical performance for large-scale nonconvex problems.
In the following, we focus on the conjugate parameter $\beta_{k+1}^{MHS}$ and propose a new hybrid conjugate parameter combining $\beta_{k+1}^{DY}$ and $\beta_{k+1}^{MHS}$:
$$\beta_{k+1}^{N} = \max\left\{0, \min\left\{\beta_{k+1}^{DY}, \beta_{k+1}^{MHS}\right\}\right\}. \qquad (7)$$
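The hybrid parameter (7), together with (5) and (6), can be computed in a few lines. The helper below is a minimal sketch (not from the paper): it guards the division by $y_k^T d_k$ with a small tolerance, which is an implementation choice of ours rather than part of the method.

```python
import numpy as np

def beta_hybrid(g_new, g_old, d_old, tol=1e-12):
    """beta^N of Eq. (7): max{0, min{beta^DY, beta^MHS}}, with beta^MHS = beta^HS * theta."""
    y = g_new - g_old
    yTd = y.dot(d_old)
    if abs(yTd) < tol:                       # safeguard, not part of the paper
        return 0.0
    theta = 1.0 - g_new.dot(d_old) ** 2 / (g_new.dot(g_new) * d_old.dot(d_old))  # Eq. (6)
    beta_dy = g_new.dot(g_new) / yTd         # Dai-Yuan parameter
    beta_mhs = g_new.dot(y) / yTd * theta    # modified HS parameter, Eq. (5)
    return max(0.0, min(beta_dy, beta_mhs))
```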
Now, based on the new hybrid conjugate parameter $\beta_{k+1}^{N}$ and the modified descent direction (8), we propose our hybrid algorithm (NMHSDY) in detail.
It should be noted that the line search technique in Algorithm 1 is not fixed: It can be selected by the users. Next, we show that the search direction d k generated by Algorithm 1 automatically has a sufficient descent property independent of any line search strategy.
Algorithm 1 New descent hybrid algorithm of MHS and DY methods (NMHSDY) for nonconvex functions.
Step 0. Input and Initialization. Select an initial point $x_0 \in \mathbb{R}^n$ and a parameter $\varepsilon \ge 0$, and compute $f_0 = f(x_0)$ and $g_0 = g(x_0)$. Set $d_0 = -g_0$ and $k = 0$;
Step 1. If $\|g_k\| \le \varepsilon$, then stop;
Step 2. Compute the step length $\alpha_k$ along the direction $d_k$ by some line search strategy;
Step 3. Let $x_{k+1} = x_k + \alpha_k d_k$;
Step 4. Compute the conjugate parameter $\beta_{k+1}^{N}$ by (7) and the search direction $d_{k+1}$ by
$$d_{k+1} = -\left(1 + \beta_{k+1}^{N}\frac{g_{k+1}^T d_k}{\|g_{k+1}\|^2}\right) g_{k+1} + \beta_{k+1}^{N} d_k; \qquad (8)$$
Step 5. Set $k := k + 1$ and go to Step 1.
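The following Python sketch strings the steps of Algorithm 1 together, reusing the beta_hybrid helper from the sketch above. It is illustrative only: SciPy's line_search implements the strong Wolfe conditions (a slightly stricter variant of (13)), the fallback step on line-search failure is our own choice, and no acceleration strategy is included.

```python
import numpy as np
from scipy.optimize import line_search  # (strong) Wolfe line search

def nmhsdy(f, grad, x0, eps=1e-6, max_iter=5000):
    """Illustrative sketch of Algorithm 1 (NMHSDY)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                        # Step 0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:              # Step 1
            break
        alpha = line_search(f, grad, x, d, gfk=g, c1=0.2, c2=0.85)[0]  # Step 2
        if alpha is None:                         # line search failed: crude fallback
            alpha = 1e-4
        x_new = x + alpha * d                     # Step 3
        g_new = grad(x_new)
        gn2 = g_new.dot(g_new)
        if gn2 <= eps ** 2:                       # already (numerically) stationary
            return x_new
        beta = beta_hybrid(g_new, g, d)           # Step 4: Eq. (7)
        d = -(1.0 + beta * g_new.dot(d) / gn2) * g_new + beta * d      # Eq. (8)
        x, g = x_new, g_new                       # Step 5
    return x

# usage on a simple strictly convex quadratic
f = lambda x: 0.5 * x.dot(x)
grad = lambda x: x
print(nmhsdy(f, grad, np.array([3.0, -4.0])))     # approaches the minimizer at 0
```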
Theorem 1.
Let the search direction $d_k$ be defined by (8) in Algorithm 1. Then, for any line search strategy, the sufficient descent property holds for the nonconvex function $f(x)$, that is,
$$g_k^T d_k = -\|g_k\|^2, \quad \forall k \ge 0. \qquad (9)$$
Proof. 
By the definition of $d_{k+1}$ in (8), we have
$$g_{k+1}^T d_{k+1} = -\|g_{k+1}\|^2 - \beta_{k+1}^{N} g_{k+1}^T d_k + \beta_{k+1}^{N} g_{k+1}^T d_k = -\|g_{k+1}\|^2.$$
Since $d_0 = -g_0$, we have $d_0^T g_0 = -\|g_0\|^2$. All in all, (9) holds. This completes the proof. □
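The identity (9) is easy to verify numerically. The snippet below (an illustrative check, not part of the paper) builds a direction via (8) from random data, using the beta_hybrid helper sketched earlier, and confirms that $g_{k+1}^T d_{k+1} = -\|g_{k+1}\|^2$ up to rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
g_old, g_new, d_old = rng.normal(size=(3, 5))
beta = beta_hybrid(g_new, g_old, d_old)                       # Eq. (7)
d_new = -(1.0 + beta * g_new.dot(d_old) / g_new.dot(g_new)) * g_new + beta * d_old  # Eq. (8)
assert np.isclose(g_new.dot(d_new), -g_new.dot(g_new))        # Eq. (9)
```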

3. Convergence for General Nonlinear Functions

In this section, the global convergence of the NMHSDY method is presented. Before that, some common assumptions are listed.
Assumption 1.
The level set $\mathcal{L} = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded, where $x_0$ is the initial point, i.e., there exists a positive constant $D > 0$ such that
$$\|x\| \le D, \quad \forall x \in \mathcal{L}. \qquad (10)$$
Assumption 2.
In some neighborhood $\mathcal{N}$ of $\mathcal{L}$, the gradient $g(x) = \nabla f(x)$ is Lipschitz continuous, i.e., there exists a constant $L_1 > 0$ such that
$$\|g(x) - g(y)\| \le L_1 \|x - y\|, \quad \forall x, y \in \mathcal{N}. \qquad (11)$$
Based on the above assumptions, we further obtain that there exists a constant M > 0 such that
$$\|g(x)\| \le M, \quad \forall x \in \mathcal{L}. \qquad (12)$$
In fact, it holds that $\|g(x)\| = \|g(x) - g(x_0) + g(x_0)\| \le \|g(x) - g(x_0)\| + \|g(x_0)\| \le L_1\|x - x_0\| + \|g(x_0)\| \le 2 L_1 D + \|g(x_0)\|$; hence, $M$ can be taken as $2 L_1 D + \|g(x_0)\|$ or larger.
The line search strategy is another important element in iterative methods. In this part, we take the standard Wolfe line search strategy:
$$f(x_k + \alpha_k d_k) \le f_k + \sigma_1 \alpha_k g_k^T d_k, \qquad g(x_k + \alpha_k d_k)^T d_k \ge \sigma_2 g_k^T d_k, \qquad (13)$$
where $0 < \sigma_1 < \sigma_2 < 1$. By property (9) and line search (13), it is satisfied that
$$f_{k+1} \le f_k - \sigma_1 \alpha_k \|g_k\|^2 \le f_k,$$
that is, the sequence $\{f_k\}$ is non-increasing and the sequence $\{x_k\}$ generated by Algorithm 1 is contained in the level set $\mathcal{L}$. Since $f$ is continuously differentiable and the set $\mathcal{L}$ is bounded, there exists a constant $f^*$ such that
$$\lim_{k \to \infty} f(x_k) = f^*.$$
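For completeness, a trial step can be checked against the two Wolfe conditions (13) directly. The function below is a small sketch under the assumption that x and d are NumPy arrays; the default values of sigma1 and sigma2 mirror those used later in the experiments.

```python
def satisfies_wolfe(f, grad, x, d, alpha, sigma1=0.2, sigma2=0.85):
    """Check the standard Wolfe conditions (13) for a trial step length alpha."""
    g_d = grad(x).dot(d)                                   # g_k^T d_k (should be < 0)
    sufficient_decrease = f(x + alpha * d) <= f(x) + sigma1 * alpha * g_d
    curvature = grad(x + alpha * d).dot(d) >= sigma2 * g_d
    return sufficient_decrease and curvature
```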
The Zoutendijk condition [35] plays an essential role in the global convergence of nonlinear CG methods. For completeness, we here state the lemma but omit its proof.
Lemma 1.
Suppose that Assumptions 1 and 2 hold. Consider any nonlinear CG method in which $\alpha_k$ is obtained by the standard Wolfe line search (13) and $d_k$ is a descent direction ($g_k^T d_k < 0$). Then, we have
$$\sum_{k=1}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty. \qquad (14)$$
Thereafter, the convergence property is presented in the following theorem for the general functions without convexity assumption.
Theorem 2.
Let Assumptions 1 and 2 hold and the sequence $\{x_k\}$ be generated by the NMHSDY algorithm. Set $l_{k+1} = \beta_{k+1}^{N}/\beta_{k+1}^{DY}$. If $l_{k+1} \in \left[\frac{1-\sigma_2}{1+\sigma_2}, 1-\sigma_2\right]$ holds, then we have
$$\liminf_{k \to \infty} \|g_k\| = 0. \qquad (15)$$
Proof. 
We now prove (15) by contradiction and assume that there exists a constant $\mu > 0$ such that
$$\|g_k\| \ge \mu, \quad \forall k \ge 0. \qquad (16)$$
Let $\gamma_{k+1} = 1 + \beta_{k+1}^{N}\frac{g_{k+1}^T d_k}{\|g_{k+1}\|^2}$; then, the direction (8) can be rewritten as
$$d_{k+1} = -\gamma_{k+1} g_{k+1} + \beta_{k+1}^{N} d_k.$$
After some algebraic manipulation, we have
$$\|d_{k+1}\|^2 = (\beta_{k+1}^{N})^2 \|d_k\|^2 - 2\gamma_{k+1} g_{k+1}^T d_{k+1} - \gamma_{k+1}^2 \|g_{k+1}\|^2.$$
Dividing both sides of the above equality by $(g_{k+1}^T d_{k+1})^2$ and using (9), we have
$$\begin{aligned}
\frac{\|d_{k+1}\|^2}{\|g_{k+1}\|^4} &= (\beta_{k+1}^{N})^2\frac{\|d_k\|^2}{\|g_{k+1}\|^4} + \frac{2\gamma_{k+1}}{\|g_{k+1}\|^2} - \frac{\gamma_{k+1}^2}{\|g_{k+1}\|^2} = (\beta_{k+1}^{N})^2\frac{\|d_k\|^2}{\|g_{k+1}\|^4} + \frac{1}{\|g_{k+1}\|^2} - \frac{(\gamma_{k+1}-1)^2}{\|g_{k+1}\|^2} \\
&= l_{k+1}^2\frac{\|d_k\|^2}{(d_k^T y_k)^2} + \frac{1}{\|g_{k+1}\|^2} - \frac{(\gamma_{k+1}-1)^2}{\|g_{k+1}\|^2} \\
&\le \frac{l_{k+1}^2}{(1-\sigma_2)^2}\frac{\|d_k\|^2}{\|g_k\|^4} + \frac{1}{\|g_{k+1}\|^2} \le \frac{\|d_k\|^2}{\|g_k\|^4} + \frac{1}{\|g_{k+1}\|^2}, \qquad (17)
\end{aligned}$$
where the first inequality holds by $d_k^T y_k \ge (\sigma_2 - 1)g_k^T d_k = (1-\sigma_2)\|g_k\|^2$ and the last inequality holds by the bound on the scalar $l_{k+1}$. By (17) and $\|d_0\|^2 = \|g_0\|^2$, it holds that
$$\frac{\|d_k\|^2}{\|g_k\|^4} \le \sum_{i=0}^{k}\frac{1}{\|g_i\|^2}.$$
Then, by the above inequality and (16), it follows that
$$\frac{\|g_k\|^4}{\|d_k\|^2} \ge \frac{\mu^2}{k+1},$$
which indicates that
$$\sum_{k=1}^{\infty} \frac{\|g_k\|^4}{\|d_k\|^2} = \sum_{k=1}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} = \infty,$$
which contradicts the Zoutendijk condition (14). So, (15) holds. □
Remark 1.
In [24], the authors presented a class of hybrid conjugate parameters, one of which is $\beta_{k+1} = \max\{0, \min\{\beta_{k+1}^{DY}, \beta_{k+1}^{HS}\}\}$, with the corresponding interval for $l_{k+1}$ being $\left[\frac{1-\sigma_2}{1+\sigma_2}, 1\right]$. It is reasonable that the interval in our paper is smaller, since we take $\beta_{k+1}^{MHS}$ instead of $\beta_{k+1}^{HS}$ and $0 < \vartheta_k \le 1$.
In the following, we discuss the global convergence of Algorithm 1 for general nonlinear functions in the case where $l_{k+1} \notin \left[\frac{1-\sigma_2}{1+\sigma_2}, 1-\sigma_2\right]$. Motivated by the modified secant conditions in [36,37], in this part, based on the Wolfe line search strategy (13), we consider the following setting:
$$\bar{y}_k = y_k + m s_k, \qquad (18)$$
where m > 0 is a constant. With the above setting, the modified conjugate parameter becomes
$$\beta_{k+1}^{NN} = \max\left\{0, \min\left\{\beta_{k+1}^{NDY}, \beta_{k+1}^{NMHS}\right\}\right\}, \qquad (19)$$
where $\beta_{k+1}^{NDY}$ and $\beta_{k+1}^{NMHS}$ are, respectively,
$$\beta_{k+1}^{NDY} = \frac{\|g_{k+1}\|^2}{\bar{y}_k^T s_k}, \qquad \beta_{k+1}^{NMHS} = \frac{g_{k+1}^T \bar{y}_k}{\bar{y}_k^T s_k}\left(1 - \frac{(g_{k+1}^T s_k)^2}{\|g_{k+1}\|^2 \|s_k\|^2}\right).$$
Meanwhile, the corresponding direction turns to
$$d_{k+1}^{N} = -\left(1 + \beta_{k+1}^{NN}\frac{g_{k+1}^T s_k}{\|g_{k+1}\|^2}\right) g_{k+1} + \beta_{k+1}^{NN} s_k. \qquad (20)$$
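The modified parameter (19) and direction (20) differ from (7) and (8) only in replacing $y_k$ by $\bar{y}_k$ and $d_k$ by $s_k$. A minimal sketch follows; the value of m is a hypothetical choice for illustration, since the text only requires m > 0.

```python
import numpy as np

def beta_hybrid_modified(g_new, g_old, s, m=0.01):
    """beta^{NN} of Eq. (19), built from y_bar = y + m*s as in Eq. (18)."""
    y_bar = (g_new - g_old) + m * s
    ybTs = y_bar.dot(s)                      # >= m * ||s||^2 under Wolfe, cf. Lemma 2
    theta = 1.0 - g_new.dot(s) ** 2 / (g_new.dot(g_new) * s.dot(s))
    beta_ndy = g_new.dot(g_new) / ybTs
    beta_nmhs = g_new.dot(y_bar) / ybTs * theta
    return max(0.0, min(beta_ndy, beta_nmhs))
```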
The following lemma indicates the properties of the scalar $\bar{y}_k^T s_k$ and the vector $\bar{y}_k$.
Lemma 2.
Let $\bar{y}_k$ be defined by (18); then, adopting the Wolfe line search strategy (13), we obtain
$$\bar{y}_k^T s_k \ge m \|s_k\|^2, \qquad (21)$$
and
$$\|\bar{y}_k\| \le (L_1 + m)\|s_k\|. \qquad (22)$$
Proof. 
By the Wolfe line search strategy (13), we have
$$y_k^T s_k = (g_{k+1} - g_k)^T s_k \ge (\sigma_2 - 1) g_k^T s_k = (1 - \sigma_2)\alpha_k \|g_k\|^2 \ge 0,$$
which indicates $y_k^T s_k \ge 0$. Therefore, it holds that
$$\bar{y}_k^T s_k = y_k^T s_k + m\|s_k\|^2 \ge m\|s_k\|^2.$$
Hence, (21) holds. Meanwhile, we also have
$$\|\bar{y}_k\| = \|y_k + m s_k\| \le \|y_k\| + m\|s_k\| \le (L_1 + m)\|s_k\|,$$
where the second inequality holds by Assumption 2. Hence, (22) holds. This completes the proof. □
In the following, we assume that Algorithm 1 never stops and there exists a constant μ > 0 such that for all k, (16) holds.
Lemma 3.
Suppose that Assumptions 1 and 2 and (16) hold. Let the sequences $\{x_k\}$ and $\{d_k^N\}$ be generated by Algorithm 1 with the conjugate parameter $\beta_k^{NN}$ and the Wolfe line search technique (13). Then, there exists a positive constant $\Gamma$ such that
$$\|g_k\| \le \|d_k^N\| \le (1 + 2\Gamma)\|g_k\|. \qquad (23)$$
Proof. 
Based on the Wolfe line search technique (13), it holds that
$$\bar{y}_k^T d_k^N = y_k^T d_k^N + m\alpha_k\|d_k^N\|^2 \ge y_k^T d_k^N \ge (\sigma_2 - 1) g_k^T d_k^N = (1-\sigma_2)\|g_k\|^2,$$
where the first inequality holds by the non-negativity of $\alpha_k$ and the last equality holds by the sufficient descent property (9). Meanwhile, by (9) and the Cauchy–Schwarz inequality, it holds that, for $k \ge 0$,
$$\|g_k\|\,\|d_k^N\| \ge -g_k^T d_k^N = \|g_k\|^2,$$
which, together with condition (16), implies that
$$\|d_k^N\| \ge \|g_k\| \ge \mu, \quad \forall k \ge 0. \qquad (25)$$
By the definition of $\beta_{k+1}^{NN}$, we obtain that
$$\begin{aligned}
|\beta_{k+1}^{NN}| &\le \max\left\{\frac{\|g_{k+1}\|^2}{|\bar{y}_k^T s_k|}, \frac{|g_{k+1}^T \bar{y}_k|}{|\bar{y}_k^T s_k|}\right\} \le \max\left\{\frac{\|g_{k+1}\|^2}{m\|s_k\|^2}, \frac{\|g_{k+1}\|\,\|\bar{y}_k\|}{m\|s_k\|^2}\right\} \le \max\left\{\frac{\|g_{k+1}\|}{m\|s_k\|}, \frac{L_1 + m}{m}\right\}\frac{\|g_{k+1}\|}{\|s_k\|} \\
&\le \max\left\{\frac{M}{m\bar{\alpha}\|d_k^N\|}, \frac{L_1 + m}{m}\right\}\frac{\|g_{k+1}\|}{\|s_k\|} \le \max\left\{\frac{M}{m\bar{\alpha}\|g_k\|}, \frac{L_1 + m}{m}\right\}\frac{\|g_{k+1}\|}{\|s_k\|} \le \max\left\{\frac{M}{m\bar{\alpha}\mu}, \frac{L_1 + m}{m}\right\}\frac{\|g_{k+1}\|}{\|s_k\|} := \Gamma\frac{\|g_{k+1}\|}{\|s_k\|},
\end{aligned}$$
where the second inequality holds by (21), the third inequality holds by (22), the fourth inequality holds by $\|g_{k+1}\| \le M$ and the condition $\alpha_k \ge \bar{\alpha} > 0$ for all $k \ge 0$, the fifth inequality holds by (25), and the last inequality holds by condition (16). Furthermore, we have
$$\left|\beta_{k+1}^{NN}\frac{g_{k+1}^T s_k}{\|g_{k+1}\|^2}\right| \le |\beta_{k+1}^{NN}|\frac{\|g_{k+1}\|\,\|s_k\|}{\|g_{k+1}\|^2} \le \Gamma\frac{\|g_{k+1}\|}{\|s_k\|}\cdot\frac{\|s_k\|}{\|g_{k+1}\|} = \Gamma.$$
By the definition of d k N in (20) and the above discussions, it holds that
$$\|d_{k+1}^N\| \le \|g_{k+1}\| + \left|\beta_{k+1}^{NN}\frac{g_{k+1}^T s_k}{\|g_{k+1}\|^2}\right|\|g_{k+1}\| + |\beta_{k+1}^{NN}|\,\|s_k\| \le \|g_{k+1}\| + \Gamma\|g_{k+1}\| + \Gamma\frac{\|g_{k+1}\|}{\|s_k\|}\|s_k\| = (1 + 2\Gamma)\|g_{k+1}\|.$$
With the help of (25), we conclude that
$$\|g_{k+1}\| \le \|d_{k+1}^N\| \le (1 + 2\Gamma)\|g_{k+1}\|.$$
Hence, (23) holds. This completes the proof. □
Theorem 3.
Suppose that Assumptions 1 and 2 hold. Let the sequences $\{x_k\}$ and $\{d_k^N\}$ be generated by Algorithm 1 with the conjugate parameter $\beta_k^{NN}$ and the Wolfe line search technique (13). Then, Algorithm 1 converges in the sense of (15).
Proof. 
We prove the conclusion by contradiction and assume that there exists a positive constant μ such that (16) holds. Otherwise, Algorithm 1 converges in the sense of (15). From (9), we conclude that the new direction enjoys the sufficient descent property. Therefore, Lemma 1 holds, which implies that
$$+\infty = \sum_{k=1}^{+\infty}\frac{\mu^2}{(1+2\Gamma)^2} \le \sum_{k=1}^{+\infty}\frac{\|g_k\|^2}{(1+2\Gamma)^2} = \sum_{k=1}^{+\infty}\frac{\|g_k\|^4}{(1+2\Gamma)^2\|g_k\|^2} \le \sum_{k=1}^{+\infty}\frac{(g_k^T d_k^N)^2}{\|d_k^N\|^2} < +\infty,$$
where the first inequality holds by (16), the second inequality holds by (9) and (23), and the last inequality holds by Lemma 1. This is a contradiction, so the assumption does not hold. Therefore, $\liminf_{k \to +\infty}\|g_k\| = 0$ holds. This completes the proof. □

4. Numerical Performance

In this section, we focus on the numerical performance of Algorithm 1 and compare it with several effective CG methods. In the experiments, we code these algorithms in Matlab 2016b and run them on a PC with an AMD 2.10 GHz processor, 16.00 GB of RAM, and the Windows 10 operating system.

4.1. Performance on Benchmark Problems

In this subsection, we check the performance of the NMHSDY method and compare it with two effective modified HS methods in [26,28] and the hybrid method in [24]. In [26], the authors proposed an effective modified HS method (MHSCG method for abbreviation) in which the conjugate parameter is
$$\beta_{k+1} = \max\{0, \bar{\beta}_{k+1}^{MHS}\}, \qquad \bar{\beta}_{k+1}^{MHS} = \beta_{k+1}^{MHS} - \lambda\frac{\|y_k\|^2\vartheta_k}{(y_k^T d_k)^2}\,g_{k+1}^T d_k,$$
where $\lambda > 1/4$ is a parameter. The direction in [26] is in the form of (3), and the corresponding method has attractive numerical performance. Dai and Kou in [28] introduced another effective class of CG schemes (DK+ method for abbreviation) depending on a parameter $\tau_k$, where the corresponding conjugate parameter $\beta_{k+1}$ is defined by
$$\beta_{k+1}^{DK}(\tau_k) = \beta_{k+1}^{HS} - \left(\tau_k + \frac{\|y_k\|^2}{y_k^T s_k} - \frac{y_k^T s_k}{\|s_k\|^2}\right)\frac{g_{k+1}^T s_k}{y_k^T d_k}, \qquad s_k = x_{k+1} - x_k.$$
The direction in [28] is also in the form of (3). To establish global convergence for general nonlinear functions, a truncated strategy is used, that is,
$$\beta_{k+1}^{DK+}(\tau_k) = \max\left\{\beta_{k+1}^{DK}(\tau_k),\ \eta\frac{g_{k+1}^T d_k}{\|d_k\|^2}\right\},$$
where $\eta \in [0, 1)$ is a parameter. The numerical results indicated that the DK+ method has good and reliable numerical performance. Dai and Yuan, in [24], proposed an effective hybrid CG method (HSDY method for abbreviation) in which the conjugate parameter is
$$\beta_{k+1} = \max\{0, \min\{\beta_{k+1}^{DY}, \beta_{k+1}^{HS}\}\}.$$
The hybrid method also has global convergence and attractive numerical performance.
In the following, we focus on the numerical performance on the large-scale unconstrained problems listed in Table 1 (see [38] for details). In order to improve numerical performance, Andrei, in [39], proposed an accelerated strategy which modifies the step length in a multiplicative manner. In this part, we also equip Algorithm 1 with this accelerated strategy and report it as a separate variant. To compare the conjugate parameters and the search directions fairly, we adopt the Wolfe line search technique (13) for all methods.
In the experiments, for each problem we consider nine large-scale dimensions with 300, 600, 900, 3000, 6000, 9000, 30,000, 60,000, and 90,000 variables. The parameters used in the Wolfe line search are $\sigma_1 = 0.20$ and $\sigma_2 = 0.85$. The other parameters for the MHSCG method and the DK+ method are set to their default values.
During the process, the Himmelblau stopping rule is adopted: if $|f(x_k)| > \varepsilon_1$, let $stop1 = \frac{|f(x_k) - f(x_{k+1})|}{|f(x_k)|}$; otherwise, $stop1 = |f(x_k) - f(x_{k-1})|$. If the condition $\|g_k\| \le \varepsilon$ or $stop1 \le \varepsilon_2$ is satisfied, the process is stopped, where the parameter values are $\varepsilon = 10^{-6}$ and $\varepsilon_1 = \varepsilon_2 = 10^{-5}$. Meanwhile, we also stop the algorithm when the number of iterations exceeds 5000.
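A direct transcription of this stopping test reads as follows (an illustrative sketch; the tolerance values are those stated above, and f_adj stands for the function value at the adjacent iterate used by the rule):

```python
import numpy as np

def should_stop(f_k, f_adj, g_k, eps=1e-6, eps1=1e-5, eps2=1e-5):
    """Himmelblau-type stopping rule combined with the gradient-norm test."""
    if abs(f_k) > eps1:
        stop1 = abs(f_k - f_adj) / abs(f_k)
    else:
        stop1 = abs(f_k - f_adj)
    return np.linalg.norm(g_k) <= eps or stop1 <= eps2
```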
In order to present the performance of the methods more intuitively, the tool in [40] is adopted to plot their performance profiles. Robustness and efficiency rates are readable on the right and left vertical axes of the corresponding performance profiles, respectively. To present a detailed numerical comparison, two different scales are considered for the $\tau$-axis. One is $\tau \in [1, 1.5]$, which shows what happens for values of $\tau$ near 1. The other is used to present the trend for large values of $\tau$. In Figure 1, Figure 2 and Figure 3, we show, respectively, the performance of these methods relative to the number of iterations ($NI$), the number of function and gradient evaluations ($NFG$, the sum of the number of function evaluations and gradient evaluations), and the CPU time consumed in seconds.
From Figure 1, Figure 2 and Figure 3, we see that Algorithm 1 is comparable to, and slightly more effective than, the HSDY method, the DK+ method, and the MHSCG method for the above problems. Meanwhile, Algorithm 1 with the accelerated strategy is much more effective and performs best in the experiment, which indicates that the accelerated technique indeed works and reduces the number of iterations and the number of function and gradient evaluations.

4.2. Comparison for Stability

In this subsection, we consider the numerical stability of Algorithm 1 for ill-conditioned matrix problems and compare it with the MHSCG method in [26]. In fact, the quadratic objective function $f(x) = x^T I x$ of (1) is ill-conditioned if the matrix $I \in \mathbb{R}^{n \times n}$ is of the form
$$I_{i,j} = \frac{1}{i + j - 1}, \quad i, j = 1, 2, \ldots, n.$$
It is clear that the matrix $I$ is ill-conditioned and positive definite [41]; we consider different dimensions $n = 5, 6, \ldots, 50$. Furthermore, the authors in [42] show that the norm condition number of the Hessian matrix gradually increases from $9.4366 \times 10^{5}$ for $n = 5$ to $6.9007 \times 10^{20}$ for $n = 50$. In the following, we explore the numerical performance. The experimental environment, the parameter values, and the stopping rule remain the same as in the above subsection. Meanwhile, the initial point is selected as $x_0 = (10, 10, \ldots, 10)^T$. The corresponding numerical results are presented in Table 2 and Table 3, in which $Dim$ is the dimension of $x$, $NI$ denotes the number of iterations, $NFG$ is the sum of the number of function and gradient evaluations, $Time$ denotes the CPU time consumed in seconds, and $f^*$ denotes the optimal function value obtained by the methods.
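For reference, the ill-conditioned quadratic used in this subsection is easy to set up. The sketch below (illustrative; n = 10 is an arbitrary choice) builds the Hilbert-type matrix $I$, the objective $f(x) = x^T I x$, its gradient, and the initial point, and prints the condition number of $I$.

```python
import numpy as np

def hilbert_like(n):
    """Matrix with entries I_{i,j} = 1/(i+j-1), i, j = 1,...,n."""
    idx = np.arange(1, n + 1)
    return 1.0 / (idx[:, None] + idx[None, :] - 1.0)

n = 10
I = hilbert_like(n)
f = lambda x: x @ I @ x                 # objective f(x) = x^T I x
grad = lambda x: 2.0 * (I @ x)          # gradient; the Hessian is 2I
x0 = 10.0 * np.ones(n)                  # initial point used in the experiments
print(np.linalg.cond(I))                # grows rapidly with n, as noted above
```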
From Table 2 and Table 3, it can be found that, for the dimensions from 5 to 50, Algorithm 1 and the MHSCG method successfully solve all of the problems and obtain reasonable optimal function values, which are all not greater than $10^{-5}$. For most problems, Algorithm 1 needs fewer iterations and fewer function and gradient evaluations and obtains better optimal values. In order to show the numerical performance intuitively, we also adopt the performance profiles in [40] for the NI and NFG cases. The corresponding performance profiles are given in Figure 4 and Figure 5.
Figure 4 shows that Algorithm 1 and the MHSCG method solve these testing problems with the least total number of iterations in 63% and 48% of cases, respectively. Figure 5 indicates that Algorithm 1 and the MHSCG method solve these testing problems with the least total number of function and gradient evaluations in 61% and 50% of cases, respectively. All in all, the numerical results show that Algorithm 1 is more effective and stable than the MHSCG method for these ill-conditioned matrix problems.

4.3. Application to Image Restoration

In this subsection, we apply Algorithm 1 to some image restoration problems [43,44,45]. During the process, the normal Wolfe line search technique is adopted, the corresponding parameters remain unchanged, and two noise level cases for the Barbara.png ( 512 × 512 ) and Baboon.bmp ( 512 × 512 ) images are considered. In this part, we stop the process when the following criteria are both satisfied:
$$\frac{|f(x_{k+1}) - f(x_k)|}{|f(x_k)|} < 10^{-3}, \qquad \|g(x_k)\| < 10^{-3}\left(1 + |f(x_k)|\right).$$
Meanwhile, to assess the restoration performance quantitatively, we also utilize the peak signal-to-noise ratio [45] (PSNR), which is defined by $PSNR = 10\log_{10}\frac{255^2}{\frac{1}{M \times N}\sum_{i,j}(u_{i,j}^r - u_{i,j}^*)^2}$, where $M \times N$ is the size of the image, and $u_{i,j}^r$ and $u_{i,j}^*$ denote the pixel values of the restored image and the original image, respectively. For the noise level, we consider two cases: $20\%$ (a low-level case) and $60\%$ (a high-level case). The consumed CPU time and the corresponding PSNR values are given in Table 4. Meanwhile, the detailed performances are presented in Figure 6 and Figure 7, respectively.
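The PSNR values reported in Table 4 can be computed directly from the above definition; the helper below is a minimal sketch assuming 8-bit grayscale images stored as NumPy arrays.

```python
import numpy as np

def psnr(restored, original):
    """Peak signal-to-noise ratio between a restored image and the original (8-bit)."""
    diff = restored.astype(np.float64) - original.astype(np.float64)
    mse = np.mean(diff ** 2)            # (1/(M*N)) * sum of squared pixel errors
    return 10.0 * np.log10(255.0 ** 2 / mse)
```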
From Table 4 and Figures 6 and 7, we see that both Algorithm 1 and the MHSCG method solve the image restoration problems successfully within a suitable time, and Algorithm 1 seems to perform slightly better than the MHSCG method.

5. Conclusions

Conjugate gradient methods are attractive and effective for large-scale smooth unconstrained optimization problems due to their simple computation and low memory requirements. The Dai–Yuan conjugate gradient method has good theoretical properties and generates a descent direction in each iteration. In contrast, the Hestenes–Stiefel conjugate gradient method automatically satisfies the conjugate condition $y_k^T d_{k+1} = 0$ without any line search technique and performs well in practice. Based on the above discussions, we propose a new descent hybrid conjugate gradient method. The proposed method has the sufficient descent property independent of any line search technique. Under some mild conditions, the proposed method is globally convergent. In the experiments, we first consider 61 unconstrained problems with 9 different dimensions of up to 90,000. Thereafter, 46 ill-conditioned matrix problems are also tested. The primary numerical results show that the proposed method is more effective and stable. Finally, we apply the hybrid method to some image restoration problems. The results indicate that our method is attractive and reliable.

Author Contributions

Conceptualization, S.W., X.W., Y.T. and L.P.; methodology, S.W. and X.W.; software, X.W.; validation, X.W., L.P. and Y.T.; formal analysis, X.W., Y.T. and L.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by Science Foundation of Zhejiang Sci-Tech University (ZSTU) under Grant No. 21062347-Y.

Data Availability Statement

All data included in this study are available upon reasonable request.

Conflicts of Interest

The authors declare no competing interests.

References

  1. Li, D.; Fukushima, M. A global and superlinear convergent Gauss-Newton-based BFGS method for symmetric nonlinear equations. SIAM J. Numer. Anal. 1999, 37, 152–172. [Google Scholar] [CrossRef]
  2. Yuan, G.; Wei, Z.; Wang, Z. Gradient trust region algorithm with limited memory BFGS update for nonsmooth convex minimization. Comput. Optim. Appl. 2013, 54, 45–64. [Google Scholar] [CrossRef]
  3. Dai, Y.; Han, J.; Liu, G.; Sun, D.; Yin, H.; Yuan, Y. Convergence properties of nonlinear conjugate gradient methods. SIAM J. Optim. 2000, 10, 345–358. [Google Scholar] [CrossRef]
  4. Hager, W.; Zhang, H. A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 2005, 16, 170–192. [Google Scholar] [CrossRef]
  5. Fletcher, R.; Reeves, C.M. Function minimization by conjugate gradients. Comput. J. 1964, 7, 149–154. [Google Scholar] [CrossRef]
  6. Dai, Y.; Yuan, Y. A nonlinear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 1999, 10, 177–182. [Google Scholar] [CrossRef]
  7. Fletcher, R. Practical Methods of Optimization; Unconstrained Optimization; John Wiley & Sons: New York, NY, USA, 1987; Volume 1. [Google Scholar]
  8. Hestenes, R.; Stiefel, L. Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. 1952, 49, 409–436. [Google Scholar] [CrossRef]
  9. Polyak, B.T. The conjugate gradient method in extreme problems. USSR Comput. Math. Math. Phys. 1969, 9, 94–112. [Google Scholar] [CrossRef]
  10. Polak, E.; Ribière, G. Note sur la convergence de méthodes de directions conjuguées. Rev. Fr. Informat Rech. Opér. 1969, 16, 35–43. [Google Scholar] [CrossRef]
  11. Liu, Y.; Storey, C. Efficient generalized conjugate gradient algorithms Part 1: Theory. J. Optim. Theory Appl. 1991, 69, 129–137. [Google Scholar] [CrossRef]
  12. Xiang, Y.; Gong, X.G. Efficiency of generalized simulated annealing. Phys. Rev. E 2000, 62, 4473–4476. [Google Scholar] [CrossRef] [PubMed]
  13. Yuan, Q.; Qian, F. A hybrid genetic algorithm for twice continuously differentiable NLP problems. Comput. Chem. Eng. 2010, 34, 36–41. [Google Scholar] [CrossRef]
  14. Gao, F.C.; Han, L.X. Implementing the Nelder-Mead simplex algorithm with adaptive parameters. Comput. Optim. Appl. 2012, 51, 259–277. [Google Scholar] [CrossRef]
  15. Yuan, G.; Wang, X.; Sheng, Z. The projection technique for two open problems of unconstrained optimization problems. J. Optim. Theory Appl. 2020, 186, 590–619. [Google Scholar] [CrossRef]
  16. Yuan, G.; Wang, X.; Sheng, Z. Family weak conjugate gradient algorithms and their convergence analysis for nonconvex functions. Numer. Algorithms 2020, 84, 935–956. [Google Scholar] [CrossRef]
  17. Mousavi, A.; Esmaeilpour, M.; Sheikhahmadi, A. A new family of Polak-Ribière-Polyak conjugate gradient method for impulse noise removal. Soft Comput. 2023, 27, 17515–17524. [Google Scholar] [CrossRef]
  18. Polyak, B.T. Introduction to Optimization; Optimization Software Inc., Publications Division: New York, NY, USA, 1987. [Google Scholar]
  19. Wang, X.; Yuan, G.; Pang, L. A class of new three-term descent conjugate gradient algorithms for large-scale unconstrained optimization and applications to image restoration problems. Numer. Algorithms 2023, 93, 949–970. [Google Scholar] [CrossRef]
  20. Wang, X. A class of spectral three-term descent Hestenes-Stiefel conjugate gradient algorithms for large-scale unconstrained optimization and image restoration problems. Appl. Numer. Math. 2023, 192, 41–56. [Google Scholar] [CrossRef]
  21. Touati-Ahmed, D.; Storey, C. Efficient hybrid conjugate gradient techniques. J. Optim.Theory Appl. 1990, 64, 379–397. [Google Scholar] [CrossRef]
  22. Zhang, L.; Zhou, W. Two descent hybrid conjugate gradient methods for optimization. J. Comput. Appl. Math. 2008, 216, 251–264. [Google Scholar] [CrossRef]
  23. Gilbert, J.; Nocedal, J. Global convergence properties of conjugate gradient methods for optimization. SIAM J. Optim. 1992, 2, 21–42. [Google Scholar] [CrossRef]
  24. Dai, Y.; Yuan, Y. An efficient hybrid conjugate gradient method for unconstrained optimization. Ann. Oper. Res. 2001, 103, 33–47. [Google Scholar] [CrossRef]
  25. Jiang, X.; Liao, W.; Yin, J.; Jian, J. A new family of hybrid three-term conjugate gradient methods with applications in image restoration. Numer. Algorithms 2022, 91, 161–191. [Google Scholar] [CrossRef]
  26. Amini, K.; Faramarzi, P.; Pirfalah, N. A modified Hestenes-Stiefel conjugate gradient method with an optimal property. Optim. Methods Softw. 2019, 34, 770–782. [Google Scholar] [CrossRef]
  27. Narushima, Y.; Yabe, H.; Ford, J. A three-term conjugate gradient method with sufficient descent property for unconstrained optimization. SIAM J. Optim. 2011, 21, 212–230. [Google Scholar] [CrossRef]
  28. Dai, Y.; Kou, C. A nonlinear conjugate gradient algorithm with an optimal property and an improved Wolfe line search. SIAM J. Optim. 2013, 23, 296–320. [Google Scholar] [CrossRef]
  29. Woldu, T.; Zhang, H.; Zhang, X.; Yemane, H. A modified nonlinear conjugate gradient algorithm for large-scale nonsmooth convex optimization. J. Optim. Theory Appl. 2020, 185, 223–238. [Google Scholar] [CrossRef]
  30. Yuan, G.; Meng, Z.; Li, Y. A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations. J. Optim. Theory Appl. 2016, 168, 129–152. [Google Scholar] [CrossRef]
  31. Babaie-Kafaki, S.; Fatemi, M.; Mahdavi-Amiri, N. Two effective hybrid conjugate gradient algorithms based on modified BFGS updates. Numer. Algorithms 2011, 58, 315–331. [Google Scholar] [CrossRef]
  32. Livieris, I.; Tampakas, V.; Pintelas, P. A descent hybrid conjugate gradient method based on the memoryless BFGS update. Numer. Algorithms 2018, 79, 1169–1185. [Google Scholar] [CrossRef]
  33. Khoshgam, Z.; Ashrafi, A. A new hybrid conjugate gradient method for large-scale unconstrained optimization problem with non-convex objective function. Comp. Appl. Math. 2019, 38, 186. [Google Scholar] [CrossRef]
  34. Narayanan, S.; Kaelo, P. A linear hybridization of Dai-Yuan and Hestenes-Stiefel conjugate gradient method for unconstrained optimization. Numer. Math. Theory Methods Appl. 2021, 14, 527–539. [Google Scholar]
  35. Zoutendijk, G. Nonlinear programming, computational methods. In Integer and Nonlinear Programming; Amsterdam, The Netherlands, 1970; pp. 37–86. [Google Scholar]
  36. Li, D.; Fukushima, M. A modified BFGS method and its global convergence in nonconvex minimization. J. Comput. Appl. Math. 2001, 129, 15–35. [Google Scholar] [CrossRef]
  37. Babaie-Kafaki, S.; Ghanbari, R. A modified scaled conjugate gradient method with global convergence for nonconvex functions. Bull. Belg. Math. Soc. Simon Stevin 2014, 21, 465–477. [Google Scholar] [CrossRef]
  38. Andrei, N. An unconstrained optimization test functions collection. Adv. Model. Optim. 2008, 10, 147–161. [Google Scholar]
  39. Andrei, N. An acceleration of gradient descent algorithm with backtracking for unconstrained optimization. Numer. Algorithms 2006, 42, 63–73. [Google Scholar] [CrossRef]
  40. Dolan, E.; Moré, J. Benchmarking optimization software with performance profiles. Math. Program 2002, 91, 201–213. [Google Scholar] [CrossRef]
  41. Watkins, S. Fundamentals of Matrix Computations; John Wiley and Sons: New York, NY, USA, 2002. [Google Scholar]
  42. Babaie-Kafaki, S. A hybrid scaling parameter for the scaled memoryless BFGS method based on the matrix norm. Int. J. Comput. Math. 2019, 96, 1595–1602. [Google Scholar] [CrossRef]
  43. Yu, G.; Huang, J.; Zhou, Y. A descent spectral conjugate gradient method for impulse noise removal. Appl. Math. Lett. 2010, 23, 555–560. [Google Scholar] [CrossRef]
  44. Yuan, G.; Lu, J.; Wang, Z. The PRP conjugate gradient algorithm with a modified WWP line search and its application in the image restoration problems. Appl. Numer. Math. 2020, 152, 1–11. [Google Scholar] [CrossRef]
  45. Bovik, A. Handbook of Image and Video Processing; Academic: New York, NY, USA, 2000. [Google Scholar]
Figure 1. Performance profiles of the methods in the number of iterations case.
Figure 2. Performance profiles of the methods in the function and gradient case.
Figure 3. Performance profiles of the methods in the CPU time consumed case.
Figure 4. Performance profiles of Algorithm 1 and the MHSCG method in the NI case.
Figure 5. Performance profiles of Algorithm 1 and the MHSCG method in the NFG case.
Figure 6. The noisy Barbara image, corrupted by salt-and-pepper noise (the first column); the images restored via Algorithm 1 (the second column) and via the MHSCG method (the third column).
Figure 7. The noisy Baboon image, corrupted by salt-and-pepper noise (the first column); the images restored via Algorithm 1 (the second column) and via the MHSCG method (the third column).
Table 1. The test problems.

No. | Problem | No. | Problem
1 | Extended Freudenstein and Roth Function | 32 | ARWHEAD (CUTE)
2 | Extended Trigonometric Function | 33 | NONDIA (Shanno-78) (CUTE)
3 | Extended Rosenbrock Function | 34 | DQDRTIC (CUTE)
4 | Extended Beale Function | 35 | EG2 (CUTE)
5 | Extended Penalty Function | 36 | DIXMAANA (CUTE)
6 | Perturbed Quadratic Function | 37 | DIXMAANB (CUTE)
7 | Raydan 1 Function | 38 | DIXMAANC (CUTE)
8 | Raydan 2 Function | 39 | DIXMAANE (CUTE)
9 | Diagonal 3 Function | 40 | Broyden Tridiagonal
10 | Generalized Tridiagonal-1 Function | 41 | Almost Perturbed Quadratic
11 | Extended Tridiagonal-1 Function | 42 | Tridiagonal Perturbed Quadratic
12 | Extended Three Exponential Terms | 43 | EDENSCH Function (CUTE)
13 | Generalized Tridiagonal-2 | 44 | VARDIM Function (CUTE)
14 | Diagonal 4 Function | 45 | LIARWHD (CUTE)
15 | Diagonal 5 Function | 46 | DIAGONAL 6
16 | Extended Himmelblau Function | 47 | DIXMAANF (CUTE)
17 | Generalized PSC1 Function | 48 | DIXMAANG (CUTE)
18 | Extended PSC1 Function | 49 | DIXMAANH (CUTE)
19 | Extended Powell Function | 50 | DIXMAANI (CUTE)
20 | Extended Cliff Function | 51 | DIXMAANJ (CUTE)
21 | Quadratic Diagonal Perturbed Function | 52 | DIXMAANK (CUTE)
22 | Extended Wood Function | 53 | DIXMAANL (CUTE)
23 | Extended Hiebert Function | 54 | DIXMAAND (CUTE)
24 | Quadratic Function QF1 | 55 | ENGVAL1 (CUTE)
25 | Extended Quadratic Penalty QP1 Function | 56 | COSINE (CUTE)
26 | Extended Quadratic Penalty QP2 Function | 57 | Extended DENSCHNB (CUTE)
27 | A Quadratic Function QF2 Function | 58 | Extended DENSCHNF (CUTE)
28 | Extended EP1 Function | 59 | SINQUAD (CUTE)
29 | Extended Tridiagonal-2 Function | 60 | Scaled Quadratic SQ1
30 | BDQRTIC (CUTE) | 61 | Scaled Quadratic SQ2
31 | TRIDIA (CUTE) | |
Table 2. Numerical results of the MHSCG method and Algorithm 1 in 5–40 dimensions.

MHSCG Method | Algorithm 1
Dim | NI | NFG | Time | f* | NI | NFG | Time | f*
5 | 5 | 22 | 0.000000 | 0.000000 | 5 | 22 | 0.000000 | 0.000000
6 | 13 | 48 | 0.031250 | 0.000000 | 13 | 48 | 0.000000 | 0.000000
7 | 21 | 72 | 0.000000 | 0.000000 | 21 | 72 | 0.000000 | 0.000000
8 | 21 | 72 | 0.000000 | 0.000000 | 21 | 72 | 0.015625 | 0.000000
9 | 27 | 92 | 0.000000 | 0.000006 | 24 | 82 | 0.000000 | 0.000000
10 | 41 | 143 | 0.000000 | 0.000007 | 29 | 97 | 0.000000 | 0.000000
11 | 75 | 246 | 0.000000 | 0.000009 | 55 | 192 | 0.000000 | 0.000000
12 | 58 | 203 | 0.000000 | 0.000003 | 70 | 249 | 0.000000 | 0.000008
13 | 57 | 190 | 0.000000 | 0.000004 | 70 | 246 | 0.000000 | 0.000000
14 | 86 | 276 | 0.031250 | 0.000008 | 123 | 429 | 0.000000 | 0.000000
15 | 75 | 264 | 0.031250 | 0.000009 | 89 | 314 | 0.000000 | 0.000009
16 | 98 | 330 | 0.000000 | 0.000008 | 52 | 177 | 0.000000 | 0.000001
17 | 24 | 82 | 0.000000 | 0.000002 | 66 | 232 | 0.000000 | 0.000002
18 | 65 | 215 | 0.031250 | 0.000009 | 28 | 97 | 0.000000 | 0.000000
19 | 31 | 105 | 0.000000 | 0.000007 | 27 | 94 | 0.000000 | 0.000000
20 | 111 | 373 | 0.031250 | 0.000009 | 131 | 452 | 0.000000 | 0.000010
21 | 117 | 379 | 0.031250 | 0.000008 | 31 | 108 | 0.000000 | 0.000000
22 | 96 | 333 | 0.000000 | 0.000010 | 66 | 236 | 0.000000 | 0.000000
23 | 68 | 233 | 0.000000 | 0.000009 | 85 | 303 | 0.031250 | 0.000000
24 | 50 | 175 | 0.031250 | 0.000000 | 43 | 151 | 0.000000 | 0.000000
25 | 96 | 318 | 0.000000 | 0.000009 | 103 | 377 | 0.000000 | 0.000000
26 | 113 | 370 | 0.000000 | 0.000009 | 112 | 410 | 0.000000 | 0.000004
27 | 22 | 78 | 0.000000 | 0.000000 | 129 | 473 | 0.031250 | 0.000001
28 | 104 | 372 | 0.000000 | 0.000009 | 24 | 87 | 0.000000 | 0.000000
29 | 156 | 513 | 0.000000 | 0.000010 | 130 | 449 | 0.000000 | 0.000001
30 | 54 | 185 | 0.000000 | 0.000001 | 121 | 446 | 0.000000 | 0.000007
31 | 39 | 144 | 0.000000 | 0.000003 | 97 | 341 | 0.000000 | 0.000008
32 | 26 | 97 | 0.000000 | 0.000001 | 79 | 292 | 0.031250 | 0.000001
33 | 87 | 309 | 0.000000 | 0.000010 | 66 | 234 | 0.031250 | 0.000002
34 | 58 | 213 | 0.000000 | 0.000006 | 60 | 217 | 0.031250 | 0.000000
35 | 97 | 327 | 0.000000 | 0.000009 | 25 | 95 | 0.000000 | 0.000002
36 | 24 | 86 | 0.000000 | 0.000000 | 132 | 481 | 0.000000 | 0.000003
37 | 23 | 84 | 0.000000 | 0.000004 | 55 | 192 | 0.000000 | 0.000001
38 | 93 | 312 | 0.046875 | 0.000010 | 40 | 142 | 0.000000 | 0.000003
39 | 148 | 487 | 0.046875 | 0.000005 | 124 | 442 | 0.140625 | 0.000010
40 | 66 | 231 | 0.000000 | 0.000007 | 136 | 496 | 0.031250 | 0.000007
Table 3. Numerical results of the MHSCG method and Algorithm 1 in 41–50 dimensions.

MHSCG Method | Algorithm 1
Dim | NI | NFG | Time | f* | NI | NFG | Time | f*
41 | 76 | 267 | 0.031250 | 0.000005 | 65 | 239 | 0.000000 | 0.000005
42 | 130 | 438 | 0.046875 | 0.000008 | 85 | 302 | 0.000000 | 0.000010
43 | 164 | 546 | 0.031250 | 0.000009 | 136 | 488 | 0.000000 | 0.000009
44 | 184 | 594 | 0.031250 | 0.000007 | 29 | 103 | 0.000000 | 0.000006
45 | 17 | 64 | 0.000000 | 0.000000 | 17 | 64 | 0.000000 | 0.000000
46 | 155 | 519 | 0.031250 | 0.000010 | 62 | 222 | 0.000000 | 0.000000
47 | 123 | 421 | 0.031250 | 0.000010 | 107 | 379 | 0.031250 | 0.000010
48 | 20 | 74 | 0.000000 | 0.000000 | 68 | 250 | 0.015625 | 0.000010
49 | 135 | 473 | 0.031250 | 0.000010 | 102 | 375 | 0.031250 | 0.000010
50 | 178 | 594 | 0.062500 | 0.000010 | 151 | 542 | 0.031250 | 0.000009
Table 4. Test results for Algorithm 1 and the MHSCG method.

Algorithm 1 | MHSCG Method
Image | Noise Level | PSNR | CPU Time | PSNR | CPU Time
Barbara | 20% | 29.6638 | 2.265625 | 29.5831 | 2.578125
Baboon | 20% | 27.9223 | 2.609375 | 27.8455 | 3.250000
Barbara | 60% | 23.1256 | 3.593750 | 23.1103 | 3.765625
Baboon | 60% | 21.1836 | 3.484375 | 21.1582 | 3.671875
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Citation: Wang, S.; Wang, X.; Tian, Y.; Pang, L. A New Hybrid Descent Algorithm for Large-Scale Nonconvex Optimization and Application to Some Image Restoration Problems. Mathematics 2024, 12, 3088. https://doi.org/10.3390/math12193088